2025-12-04T10:32:01.2978283Z Current runner version: '2.329.0'
2025-12-04T10:32:01.2981220Z Runner name: 'linux.rocm.gpu.gfx942.4.b-bphpw-runner-5l4hk'
2025-12-04T10:32:01.2981642Z Runner group name: 'default'
2025-12-04T10:32:01.2982038Z Machine name: 'linux'
2025-12-04T10:32:01.2983173Z ##[group]GITHUB_TOKEN Permissions
2025-12-04T10:32:01.2984193Z Contents: read
2025-12-04T10:32:01.2984656Z Metadata: read
2025-12-04T10:32:01.2984878Z ##[endgroup]
2025-12-04T10:32:01.2985890Z Secret source: Actions
2025-12-04T10:32:01.2986199Z Prepare workflow directory
2025-12-04T10:32:01.3220522Z Prepare all required actions
2025-12-04T10:32:01.3240193Z Getting action download info
2025-12-04T10:32:01.8070760Z Download action repository 'pytorch/pytorch@main' (SHA:c0cb6e78404416d418350632bfc554710a5f7281)
2025-12-04T10:32:06.5433626Z Download action repository 'pytorch/test-infra@main' (SHA:39aa74d619174326f4e2fb0e216151c2f29d9ffd)
2025-12-04T10:32:07.8911741Z Download action repository 'actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02' (SHA:ea165f8d65b6e75b540449e92b4886f43607fa02)
2025-12-04T10:32:09.0280876Z Download action repository 'aws-actions/configure-aws-credentials@ececac1a45f3b08a01d2dd070d28d111c5fe6722' (SHA:ececac1a45f3b08a01d2dd070d28d111c5fe6722)
2025-12-04T10:32:10.0993856Z Getting action download info
2025-12-04T10:32:10.2806092Z Download action repository 'actions/checkout@v4' (SHA:34e114876b0b11c390a56381ad16ebd13914f8d5)
2025-12-04T10:32:11.2750728Z Getting action download info
2025-12-04T10:32:11.5345685Z Download action repository 'nick-fields/retry@v3.0.0' (SHA:7152eba30c6575329ac0576536151aca5a72780e)
2025-12-04T10:32:12.4470373Z Getting action download info
2025-12-04T10:32:12.6435277Z Uses: pytorch/pytorch/.github/workflows/_rocm-test.yml@refs/heads/main (ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32)
2025-12-04T10:32:12.6437421Z ##[group] Inputs
2025-12-04T10:32:12.6437570Z build-environment: linux-jammy-rocm-py3.10
2025-12-04T10:32:12.6440943Z test-matrix: {"include": [{"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}]}
2025-12-04T10:32:12.6444321Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a
2025-12-04T10:32:12.6444611Z sync-tag:
2025-12-04T10:32:12.6445143Z timeout-minutes: 300
2025-12-04T10:32:12.6445278Z tests-to-include:
2025-12-04T10:32:12.6445393Z dashboard-tag:
2025-12-04T10:32:12.6445634Z disable-monitor: true
2025-12-04T10:32:12.6445751Z monitor-log-interval: 5
2025-12-04T10:32:12.6445873Z monitor-data-collect-interval: 1
2025-12-04T10:32:12.6446005Z ##[endgroup]
2025-12-04T10:32:12.6446220Z Complete job name: linux-jammy-rocm-py3.10 / test (distributed, 2, 3, linux.rocm.gpu.gfx942.4.b, mem_leak_check, unstable)
2025-12-04T10:32:12.6732161Z ##[group]Run pytorch/pytorch/.github/actions/checkout-pytorch@main
2025-12-04T10:32:12.6732443Z with:
2025-12-04T10:32:12.6732544Z no-sudo: true
2025-12-04T10:32:12.6732642Z submodules: recursive
2025-12-04T10:32:12.6732746Z fetch-depth: 0
2025-12-04T10:32:12.6732887Z env:
2025-12-04T10:32:12.6732981Z GIT_DEFAULT_BRANCH: main
2025-12-04T10:32:12.6733102Z ##[endgroup]
2025-12-04T10:32:12.6777718Z ##[group]Run echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT"
2025-12-04T10:32:12.6778107Z echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT"
2025-12-04T10:32:12.6784950Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2025-12-04T10:32:12.6785108Z env:
2025-12-04T10:32:12.6785208Z GIT_DEFAULT_BRANCH: main
2025-12-04T10:32:12.6785315Z ##[endgroup]
2025-12-04T10:32:12.6945808Z ##[group]Run actions/checkout@v4
2025-12-04T10:32:12.6946006Z with:
2025-12-04T10:32:12.6946132Z ref: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32
2025-12-04T10:32:12.6946271Z fetch-depth: 0
2025-12-04T10:32:12.6946369Z submodules: recursive
2025-12-04T10:32:12.6946568Z show-progress: false
2025-12-04T10:32:12.6946683Z repository: pytorch/pytorch
2025-12-04T10:32:12.6947035Z token: ***
2025-12-04T10:32:12.6947131Z ssh-strict: true
2025-12-04T10:32:12.6947217Z ssh-user: git
2025-12-04T10:32:12.6947316Z persist-credentials: true
2025-12-04T10:32:12.6947422Z clean: true
2025-12-04T10:32:12.6947534Z sparse-checkout-cone-mode: true
2025-12-04T10:32:12.6947657Z fetch-tags: false
2025-12-04T10:32:12.6947753Z lfs: false
2025-12-04T10:32:12.6947838Z set-safe-directory: true
2025-12-04T10:32:12.6947943Z env:
2025-12-04T10:32:12.6948028Z GIT_DEFAULT_BRANCH: main
2025-12-04T10:32:12.6948129Z ##[endgroup]
2025-12-04T10:32:12.7486151Z Syncing repository: pytorch/pytorch
2025-12-04T10:32:12.7486731Z ##[group]Getting Git version info
2025-12-04T10:32:12.7486913Z Working directory is '/home/runner/_work/pytorch/pytorch'
2025-12-04T10:32:12.7487175Z [command]/usr/bin/git version
2025-12-04T10:32:12.7487292Z git version 2.52.0
2025-12-04T10:32:12.7499607Z ##[endgroup]
2025-12-04T10:32:12.7505087Z Copying '/home/runner/.gitconfig' to '/home/runner/_work/_temp/37b56e8c-a39e-4c9e-9610-87829546a25e/.gitconfig'
2025-12-04T10:32:12.7511050Z Temporarily overriding HOME='/home/runner/_work/_temp/37b56e8c-a39e-4c9e-9610-87829546a25e' before making global git config changes
2025-12-04T10:32:12.7511563Z Adding repository directory to the temporary git global config as a safe directory
2025-12-04T10:32:12.7514004Z [command]/usr/bin/git config --global --add safe.directory /home/runner/_work/pytorch/pytorch
2025-12-04T10:32:12.7541354Z [command]/usr/bin/git config --local --get remote.origin.url
2025-12-04T10:32:12.7555671Z https://github.com/pytorch/pytorch
2025-12-04T10:32:12.7574123Z ##[group]Removing previously created refs, to avoid conflicts
2025-12-04T10:32:12.7577400Z [command]/usr/bin/git rev-parse --symbolic-full-name --verify --quiet HEAD
2025-12-04T10:32:12.7597971Z refs/heads/main
2025-12-04T10:32:12.7608885Z [command]/usr/bin/git checkout --detach
2025-12-04T10:32:14.3195057Z HEAD is now at c0cb6e784044 [DTensor] ExplicitRedistributionContext warning mode (#169452)
2025-12-04T10:32:14.3232061Z [command]/usr/bin/git branch --delete --force main
2025-12-04T10:32:14.3372023Z Deleted branch main (was c0cb6e784044).
2025-12-04T10:32:14.3377371Z ##[endgroup]
2025-12-04T10:32:14.3379760Z [command]/usr/bin/git submodule status
2025-12-04T10:32:14.3568020Z 7e1e1fe3858c63c251c637ae41a20de425dde96f android/libs/fbjni (v0.1.0-12-g7e1e1fe)
2025-12-04T10:32:14.3625193Z 4dfe081cf6bcd15db339cf2680b9281b8451eeb3 third_party/FP16 (4dfe081)
2025-12-04T10:32:14.3665823Z b408327ac2a15ec3e43352421954f5b1967701d1 third_party/FXdiv (b408327)
2025-12-04T10:32:14.3718134Z c07e3a0400713d546e0dea2d5466dd22ea389c73 third_party/NNPACK (c07e3a0)
2025-12-04T10:32:14.3753026Z 3ebbc93ded7285963bff932c678fa367eb393ba6 third_party/NVTX (v3.1.0-313-g3ebbc93)
2025-12-04T10:32:14.3811143Z 1d8f600fd424278486eade7ed3e877c99f0846b1 third_party/VulkanMemoryAllocator (v2.1.0-982-g1d8f600)
2025-12-04T10:32:14.4128247Z 51a0103656eff6fc9bfd39a4597923c4b542c883 third_party/XNNPACK (remotes/origin/ds/ndk-1243-g51a0103656)
2025-12-04T10:32:14.4151859Z 01aae101b9e5e94d6c16a9514c9fb8df99c93150 third_party/aiter (v0.1.1-92-g01aae101)
2025-12-04T10:32:14.4171862Z 299e5928955cc62af9968370293b916f5130916f third_party/benchmark (v1.9.3)
2025-12-04T10:32:14.4228877Z 7fe50dc3da2069d6645d9deb8c017a876472a977 third_party/composable_kernel (rocm-6.4.3-459-g7fe50dc3d)
2025-12-04T10:32:14.4304230Z 89c932f313c6437c38f2982869beacc89c2f2246 third_party/cpp-httplib (v0.26.0)
2025-12-04T10:32:14.4374944Z f858c30bcb16f8effd5ff46996f0514539e17abc third_party/cpuinfo (f858c30)
2025-12-04T10:32:14.4412261Z 0b1577c8c83401237d601d0d0db5210506705396 third_party/cudnn_frontend (v0.5-61-g0b1577c)
2025-12-04T10:32:14.4477143Z f88806b1e31dfa579842638740216dd41fc6c588 third_party/cutlass (v4.3.1)
2025-12-04T10:32:14.4501286Z c0b988d39a9e47c794d699f29930ed4d7c7e13a4 third_party/fbgemm (v1.4.0-rc1-2-gc0b988d39)
2025-12-04T10:32:14.4560223Z 979702c87a8713a8e0a5e9fee122b90d2ef13be5 third_party/flash-attention (v2.7.4)
2025-12-04T10:32:14.4573024Z a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757 third_party/flatbuffers (v24.12.23)
2025-12-04T10:32:14.4806232Z 407c905e45ad75fc29bf0f9bb7c5c2fd3475976f third_party/fmt (12.1.0)
2025-12-04T10:32:14.4876362Z 3fb5c176c17c765a3492cd2f0321b0dab712f350 third_party/gemmlowp/gemmlowp (remotes/origin/revert-87-master-135-g3fb5c17)
2025-12-04T10:32:14.4947915Z 54cbae0d3a67fa890b4c3d9ee162b7860315e341 third_party/gloo (remotes/origin/gh/c-p-i-o/1/base-37-g54cbae0)
2025-12-04T10:32:14.5087923Z 52eb8108c5bdec04579160ae17225d66034bd723 third_party/googletest (release-1.8.0-3544-g52eb8108)
2025-12-04T10:32:14.5161753Z 719d8e6cd7f7a0e01b155657526d693acf97c2b3 third_party/ideep (pytorch-rls-v3.7.1)
2025-12-04T10:32:14.5211342Z dec1d23ca65ab069d225dfe40dea14f455170959 third_party/ittapi (v3.25.5)
2025-12-04T10:32:14.5326063Z 31f85df8fbd89c188f14ef10f1ec65379786b943 third_party/kineto (heads/main)
2025-12-04T10:32:14.5348217Z d7770c89632329a9914ef1a90289917597639cbe third_party/kleidiai (v1.15.0)
2025-12-04T10:32:14.5364862Z fbd8b99c2b828428947d70fdc046bb55609be93e third_party/mimalloc (v2.2.4)
2025-12-04T10:32:14.5382332Z 55f93686c01528224f448c19128836e7df245f72 third_party/nlohmann (v3.12.0)
2025-12-04T10:32:14.5590181Z e709452ef2bbc1d113faf678c24e6d3467696e83 third_party/onnx (v1.18.0)
2025-12-04T10:32:14.5608375Z a799f4aed9c94b765dcdaabaeab7d5e7e2310878 third_party/opentelemetry-cpp (v1.14.2)
2025-12-04T10:32:14.5626940Z 0fa0ef591e38c2758e3184c6c23e497b9f732ffa third_party/pocketfft (release_for_eigen-40-g0fa0ef5)
2025-12-04T10:32:14.5842344Z d1eca4e4b421cd2997495c4b4e65cea6be4e9b8a third_party/protobuf (v3.7.0-rc.2-1279-gd1eca4e4b)
2025-12-04T10:32:14.5889531Z 072586a71b55b7f8c584153d223e95687148a900 third_party/psimd (heads/master) 2025-12-04T10:32:14.5926721Z 4fe0e1e183925bf8cfa6aae24237e724a96479b8 third_party/pthreadpool (0.1-144-g4fe0e1e) 2025-12-04T10:32:14.5943421Z f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8 third_party/pybind11 (v3.0.1) 2025-12-04T10:32:14.5987032Z f45429b087dd7d5bc78bb40dc7cf06425c252d67 third_party/python-peachpy (remotes/origin/pre-generated) 2025-12-04T10:32:14.6040617Z 5a1d179df9cf652951b59010a2d2075372d67f68 third_party/sleef (3.8) 2025-12-04T10:32:14.6088447Z 2b4cd91092d335a697416b2a3cb398283246849d third_party/tensorpipe (heads/main) 2025-12-04T10:32:14.6097247Z ##[group]Cleaning the repository 2025-12-04T10:32:14.6101637Z [command]/usr/bin/git clean -ffdx 2025-12-04T10:32:14.6228803Z [command]/usr/bin/git reset --hard HEAD 2025-12-04T10:32:14.6968273Z HEAD is now at c0cb6e784044 [DTensor] ExplicitRedistributionContext warning mode (#169452) 2025-12-04T10:32:14.7024884Z ##[endgroup] 2025-12-04T10:32:14.7027603Z ##[group]Disabling automatic garbage collection 2025-12-04T10:32:14.7031613Z [command]/usr/bin/git config --local gc.auto 0 2025-12-04T10:32:14.7064245Z ##[endgroup] 2025-12-04T10:32:14.7064434Z ##[group]Setting up auth 2025-12-04T10:32:14.7068113Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-12-04T10:32:14.7085348Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-12-04T10:32:14.7275946Z Entering 'android/libs/fbjni' 2025-12-04T10:32:14.7305622Z Entering 'third_party/FP16' 2025-12-04T10:32:14.7330167Z Entering 'third_party/FXdiv' 2025-12-04T10:32:14.7366451Z Entering 'third_party/NNPACK' 2025-12-04T10:32:14.7403701Z Entering 'third_party/NVTX' 2025-12-04T10:32:14.7439845Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T10:32:14.7465180Z Entering 'third_party/XNNPACK' 2025-12-04T10:32:14.7494426Z Entering 'third_party/aiter' 2025-12-04T10:32:14.7524502Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T10:32:14.7567008Z Entering 'third_party/benchmark' 2025-12-04T10:32:14.7592632Z Entering 'third_party/composable_kernel' 2025-12-04T10:32:14.7628012Z Entering 'third_party/cpp-httplib' 2025-12-04T10:32:14.7671122Z Entering 'third_party/cpuinfo' 2025-12-04T10:32:14.7699534Z Entering 'third_party/cudnn_frontend' 2025-12-04T10:32:14.7736205Z Entering 'third_party/cutlass' 2025-12-04T10:32:14.7764287Z Entering 'third_party/fbgemm' 2025-12-04T10:32:14.7790497Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T10:32:14.7822009Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T10:32:14.7854040Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T10:32:14.7879299Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T10:32:14.7914919Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T10:32:14.7943213Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T10:32:14.7972768Z Entering 'third_party/fbgemm/external/json' 2025-12-04T10:32:14.8008111Z Entering 'third_party/flash-attention' 2025-12-04T10:32:14.8033309Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T10:32:14.8067590Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T10:32:14.8099382Z Entering 'third_party/flatbuffers' 2025-12-04T10:32:14.8124055Z Entering 'third_party/fmt' 2025-12-04T10:32:14.8156233Z Entering 'third_party/gemmlowp/gemmlowp' 
2025-12-04T10:32:14.8194803Z Entering 'third_party/gloo' 2025-12-04T10:32:14.8220774Z Entering 'third_party/googletest' 2025-12-04T10:32:14.8243438Z Entering 'third_party/ideep' 2025-12-04T10:32:14.8266002Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T10:32:14.8304465Z Entering 'third_party/ittapi' 2025-12-04T10:32:14.8327358Z Entering 'third_party/kineto' 2025-12-04T10:32:14.8351899Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T10:32:14.8376351Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T10:32:14.8399190Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T10:32:14.8426428Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T10:32:14.8454114Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T10:32:14.8480082Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T10:32:14.8500600Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T10:32:14.8522215Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T10:32:14.8547447Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T10:32:14.8569223Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T10:32:14.8597584Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T10:32:14.8636962Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:14.8665380Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:14.8691910Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T10:32:14.8713844Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T10:32:14.8737562Z Entering 'third_party/kleidiai' 2025-12-04T10:32:14.8760284Z Entering 'third_party/mimalloc' 2025-12-04T10:32:14.8788343Z Entering 'third_party/nlohmann' 2025-12-04T10:32:14.8813342Z Entering 'third_party/onnx' 2025-12-04T10:32:14.8857716Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T10:32:14.8898226Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T10:32:14.8928902Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T10:32:14.8960677Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T10:32:14.8990894Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T10:32:14.9016803Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T10:32:14.9039357Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T10:32:14.9058874Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T10:32:14.9077401Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T10:32:14.9098414Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:14.9122909Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:14.9154243Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T10:32:14.9192527Z Entering 'third_party/pocketfft' 2025-12-04T10:32:14.9221986Z Entering 'third_party/protobuf' 2025-12-04T10:32:14.9253070Z Entering 'third_party/protobuf/third_party/benchmark' 
2025-12-04T10:32:14.9284947Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T10:32:14.9319047Z Entering 'third_party/psimd' 2025-12-04T10:32:14.9345901Z Entering 'third_party/pthreadpool' 2025-12-04T10:32:14.9376437Z Entering 'third_party/pybind11' 2025-12-04T10:32:14.9398091Z Entering 'third_party/python-peachpy' 2025-12-04T10:32:14.9422673Z Entering 'third_party/sleef' 2025-12-04T10:32:14.9452552Z Entering 'third_party/tensorpipe' 2025-12-04T10:32:14.9476953Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T10:32:14.9498545Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T10:32:14.9520298Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T10:32:14.9541144Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T10:32:14.9564904Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T10:32:14.9603588Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-12-04T10:32:14.9621915Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-12-04T10:32:14.9786471Z Entering 'android/libs/fbjni' 2025-12-04T10:32:14.9810603Z Entering 'third_party/FP16' 2025-12-04T10:32:14.9835925Z Entering 'third_party/FXdiv' 2025-12-04T10:32:14.9857612Z Entering 'third_party/NNPACK' 2025-12-04T10:32:14.9880292Z Entering 'third_party/NVTX' 2025-12-04T10:32:14.9902201Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T10:32:14.9930229Z Entering 'third_party/XNNPACK' 2025-12-04T10:32:14.9958038Z Entering 'third_party/aiter' 2025-12-04T10:32:14.9981157Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T10:32:15.0006873Z Entering 'third_party/benchmark' 2025-12-04T10:32:15.0029028Z Entering 'third_party/composable_kernel' 2025-12-04T10:32:15.0054434Z Entering 'third_party/cpp-httplib' 2025-12-04T10:32:15.0083694Z Entering 'third_party/cpuinfo' 2025-12-04T10:32:15.0107793Z Entering 'third_party/cudnn_frontend' 2025-12-04T10:32:15.0128719Z Entering 'third_party/cutlass' 2025-12-04T10:32:15.0153452Z Entering 'third_party/fbgemm' 2025-12-04T10:32:15.0175813Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T10:32:15.0205468Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T10:32:15.0238479Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T10:32:15.0268293Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T10:32:15.0294480Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T10:32:15.0320956Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T10:32:15.0349838Z Entering 'third_party/fbgemm/external/json' 2025-12-04T10:32:15.0372895Z Entering 'third_party/flash-attention' 2025-12-04T10:32:15.0396246Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T10:32:15.0424637Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T10:32:15.0461401Z Entering 'third_party/flatbuffers' 2025-12-04T10:32:15.0490468Z Entering 'third_party/fmt' 2025-12-04T10:32:15.0517558Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T10:32:15.0546494Z Entering 'third_party/gloo' 2025-12-04T10:32:15.0568561Z Entering 'third_party/googletest' 2025-12-04T10:32:15.0592182Z Entering 'third_party/ideep' 2025-12-04T10:32:15.0621121Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T10:32:15.0648106Z Entering 
'third_party/ittapi' 2025-12-04T10:32:15.0671788Z Entering 'third_party/kineto' 2025-12-04T10:32:15.0695837Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T10:32:15.0721879Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T10:32:15.0754397Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T10:32:15.0778224Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T10:32:15.0801949Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T10:32:15.0834526Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T10:32:15.0858848Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T10:32:15.0893409Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T10:32:15.0920534Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T10:32:15.0950741Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T10:32:15.0977114Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T10:32:15.1009409Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:15.1040635Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:15.1071220Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T10:32:15.1093727Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T10:32:15.1122666Z Entering 'third_party/kleidiai' 2025-12-04T10:32:15.1144517Z Entering 'third_party/mimalloc' 2025-12-04T10:32:15.1166056Z Entering 'third_party/nlohmann' 2025-12-04T10:32:15.1191659Z Entering 'third_party/onnx' 2025-12-04T10:32:15.1225841Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T10:32:15.1257264Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T10:32:15.1294464Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T10:32:15.1324314Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T10:32:15.1361276Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T10:32:15.1392562Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T10:32:15.1419253Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T10:32:15.1441029Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T10:32:15.1467173Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T10:32:15.1489642Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:15.1512054Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:15.1534630Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T10:32:15.1562559Z Entering 'third_party/pocketfft' 2025-12-04T10:32:15.1583854Z Entering 'third_party/protobuf' 2025-12-04T10:32:15.1605545Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T10:32:15.1626940Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T10:32:15.1652393Z Entering 'third_party/psimd' 2025-12-04T10:32:15.1677102Z Entering 'third_party/pthreadpool' 2025-12-04T10:32:15.1699091Z Entering 'third_party/pybind11' 
2025-12-04T10:32:15.1720659Z Entering 'third_party/python-peachpy' 2025-12-04T10:32:15.1741176Z Entering 'third_party/sleef' 2025-12-04T10:32:15.1762474Z Entering 'third_party/tensorpipe' 2025-12-04T10:32:15.1783149Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T10:32:15.1805914Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T10:32:15.1840438Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T10:32:15.1861975Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T10:32:15.1889607Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T10:32:15.1930939Z [command]/usr/bin/git config --local --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.1956179Z [command]/usr/bin/git submodule foreach --recursive git config --local --show-origin --name-only --get-regexp remote.origin.url 2025-12-04T10:32:15.2122397Z Entering 'android/libs/fbjni' 2025-12-04T10:32:15.2138803Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T10:32:15.2147263Z Entering 'third_party/FP16' 2025-12-04T10:32:15.2162383Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T10:32:15.2171578Z Entering 'third_party/FXdiv' 2025-12-04T10:32:15.2183524Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T10:32:15.2191628Z Entering 'third_party/NNPACK' 2025-12-04T10:32:15.2206174Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T10:32:15.2215094Z Entering 'third_party/NVTX' 2025-12-04T10:32:15.2226813Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T10:32:15.2236306Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T10:32:15.2248068Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T10:32:15.2255947Z Entering 'third_party/XNNPACK' 2025-12-04T10:32:15.2266211Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T10:32:15.2281353Z Entering 'third_party/aiter' 2025-12-04T10:32:15.2291715Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T10:32:15.2301464Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T10:32:15.2314944Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T10:32:15.2334793Z Entering 'third_party/benchmark' 2025-12-04T10:32:15.2347493Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T10:32:15.2355942Z Entering 'third_party/composable_kernel' 2025-12-04T10:32:15.2365822Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T10:32:15.2377769Z Entering 'third_party/cpp-httplib' 2025-12-04T10:32:15.2390927Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T10:32:15.2400795Z Entering 'third_party/cpuinfo' 2025-12-04T10:32:15.2410753Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T10:32:15.2420981Z Entering 'third_party/cudnn_frontend' 2025-12-04T10:32:15.2439688Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-12-04T10:32:15.2454157Z Entering 'third_party/cutlass' 2025-12-04T10:32:15.2466835Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-12-04T10:32:15.2481381Z Entering 'third_party/fbgemm' 2025-12-04T10:32:15.2499632Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T10:32:15.2512325Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T10:32:15.2525763Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T10:32:15.2534507Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T10:32:15.2548950Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T10:32:15.2561186Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T10:32:15.2579514Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-12-04T10:32:15.2590850Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T10:32:15.2608468Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T10:32:15.2621741Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T10:32:15.2636855Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T10:32:15.2648693Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T10:32:15.2662421Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T10:32:15.2673787Z Entering 'third_party/fbgemm/external/json' 2025-12-04T10:32:15.2685254Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-12-04T10:32:15.2697784Z Entering 'third_party/flash-attention' 2025-12-04T10:32:15.2707081Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T10:32:15.2715058Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T10:32:15.2724958Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-12-04T10:32:15.2736768Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T10:32:15.2746999Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-12-04T10:32:15.2759781Z Entering 'third_party/flatbuffers' 2025-12-04T10:32:15.2773581Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T10:32:15.2790948Z Entering 'third_party/fmt' 2025-12-04T10:32:15.2804185Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-12-04T10:32:15.2817112Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T10:32:15.2827208Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T10:32:15.2837274Z Entering 'third_party/gloo' 2025-12-04T10:32:15.2852199Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 
2025-12-04T10:32:15.2863680Z Entering 'third_party/googletest' 2025-12-04T10:32:15.2877966Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-12-04T10:32:15.2890464Z Entering 'third_party/ideep' 2025-12-04T10:32:15.2902069Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T10:32:15.2910886Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T10:32:15.2923470Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T10:32:15.2938399Z Entering 'third_party/ittapi' 2025-12-04T10:32:15.2958603Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-12-04T10:32:15.2969854Z Entering 'third_party/kineto' 2025-12-04T10:32:15.2984835Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T10:32:15.2995112Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T10:32:15.3005173Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T10:32:15.3015971Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T10:32:15.3028298Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T10:32:15.3039048Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T10:32:15.3051576Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T10:32:15.3061303Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T10:32:15.3072239Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T10:32:15.3081238Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T10:32:15.3098193Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T10:32:15.3111410Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T10:32:15.3127141Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T10:32:15.3142189Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T10:32:15.3155598Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-12-04T10:32:15.3169719Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T10:32:15.3186404Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T10:32:15.3198068Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T10:32:15.3213707Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T10:32:15.3224457Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T10:32:15.3236072Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T10:32:15.3251015Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T10:32:15.3266226Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T10:32:15.3281351Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:15.3292868Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T10:32:15.3304856Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:15.3320886Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T10:32:15.3335794Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T10:32:15.3348318Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-12-04T10:32:15.3357720Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T10:32:15.3367272Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-12-04T10:32:15.3377828Z Entering 'third_party/kleidiai' 2025-12-04T10:32:15.3392398Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-12-04T10:32:15.3402264Z Entering 'third_party/mimalloc' 2025-12-04T10:32:15.3412937Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-12-04T10:32:15.3422093Z Entering 'third_party/nlohmann' 2025-12-04T10:32:15.3433371Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-12-04T10:32:15.3444891Z Entering 'third_party/onnx' 2025-12-04T10:32:15.3454846Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-12-04T10:32:15.3471467Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T10:32:15.3481988Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-12-04T10:32:15.3500377Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T10:32:15.3510192Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-12-04T10:32:15.3519784Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T10:32:15.3532156Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-12-04T10:32:15.3542269Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T10:32:15.3553432Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-12-04T10:32:15.3563131Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T10:32:15.3582144Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-12-04T10:32:15.3592099Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T10:32:15.3606072Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-12-04T10:32:15.3621878Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T10:32:15.3645644Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-12-04T10:32:15.3655062Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T10:32:15.3668172Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-12-04T10:32:15.3678444Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T10:32:15.3690046Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T10:32:15.3702936Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:15.3712691Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T10:32:15.3722271Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:15.3738266Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T10:32:15.3753393Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T10:32:15.3763449Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-12-04T10:32:15.3780318Z Entering 'third_party/pocketfft' 2025-12-04T10:32:15.3790443Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-12-04T10:32:15.3800016Z Entering 'third_party/protobuf' 2025-12-04T10:32:15.3809342Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-12-04T10:32:15.3819099Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T10:32:15.3832658Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-12-04T10:32:15.3842618Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T10:32:15.3864011Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-12-04T10:32:15.3873331Z Entering 'third_party/psimd' 2025-12-04T10:32:15.3883807Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-12-04T10:32:15.3892961Z Entering 'third_party/pthreadpool' 2025-12-04T10:32:15.3904024Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-12-04T10:32:15.3916038Z Entering 'third_party/pybind11' 2025-12-04T10:32:15.3931880Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-12-04T10:32:15.3941475Z Entering 'third_party/python-peachpy' 2025-12-04T10:32:15.3951602Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-12-04T10:32:15.3966581Z Entering 'third_party/sleef' 2025-12-04T10:32:15.3979560Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-12-04T10:32:15.3988727Z Entering 'third_party/tensorpipe' 2025-12-04T10:32:15.4001822Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-12-04T10:32:15.4010889Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T10:32:15.4027025Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-12-04T10:32:15.4038266Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T10:32:15.4048458Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-12-04T10:32:15.4057260Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T10:32:15.4071415Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-12-04T10:32:15.4084445Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T10:32:15.4100394Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-12-04T10:32:15.4109821Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T10:32:15.4121901Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-12-04T10:32:15.4148453Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4166444Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4183698Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4201508Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4218780Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4232032Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4270820Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4271409Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4286080Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4299665Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4313114Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4327864Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4343195Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4355223Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4370149Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4383387Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4398946Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4412461Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4425796Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4440174Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4454650Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4469966Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4483322Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4496114Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4509352Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 
2025-12-04T10:32:15.4526382Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4539158Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4552295Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4565784Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4578364Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4591462Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4609524Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4626563Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4646184Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4664167Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4678289Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4693167Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4711319Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4727916Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4741468Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4756058Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4770551Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4784052Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4796351Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4811675Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4824999Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4842521Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4856467Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4870055Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4883961Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4897883Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4910965Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4926534Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4940225Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4955579Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4973802Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.4989046Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 
2025-12-04T10:32:15.5004190Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.5023905Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.5039840Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.5059613Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.5073656Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.5088037Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.5102285Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.5116442Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.5129505Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.5144427Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.5158732Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.5175430Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.5189212Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.5210129Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.5224464Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.5238492Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.5259118Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.5279370Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.5295714Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.5311333Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.5328765Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.5342770Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.5356592Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.5370615Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:15.5386508Z [command]/usr/bin/git config --local http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-12-04T10:32:15.5413105Z ##[endgroup] 2025-12-04T10:32:15.5413287Z ##[group]Fetching the repository 2025-12-04T10:32:15.5420555Z [command]/usr/bin/git -c protocol.version=2 fetch --prune --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/* 2025-12-04T10:32:19.3642732Z From https://github.com/pytorch/pytorch 2025-12-04T10:32:19.3643337Z * [new branch] 2.6.0.dev20241004+ -> origin/2.6.0.dev20241004+ 2025-12-04T10:32:19.3643830Z * [new branch] 2.9.1 -> origin/2.9.1 2025-12-04T10:32:19.3644414Z * [new branch] AaronWang04_addmmfusion_perftest -> origin/AaronWang04_addmmfusion_perftest 2025-12-04T10:32:19.3645032Z * [new branch] Flamefire-patch-1 -> origin/Flamefire-patch-1 2025-12-04T10:32:19.3645683Z * [new branch] HDCharles-2.6.0-release-notes -> origin/HDCharles-2.6.0-release-notes 2025-12-04T10:32:19.3646250Z * [new branch] HOPrintFunc -> origin/HOPrintFunc 2025-12-04T10:32:19.3646745Z * [new branch] IvanKobzarev/stack/1 -> origin/IvanKobzarev/stack/1 2025-12-04T10:32:19.3647264Z * [new branch] NicoshevSVE128 -> origin/NicoshevSVE128 2025-12-04T10:32:19.3647773Z * [new branch] PR-AOTInductorNoneBug -> origin/PR-AOTInductorNoneBug 2025-12-04T10:32:19.3648332Z * [new branch] PR-AOTInductorNoneBugFix -> origin/PR-AOTInductorNoneBugFix 2025-12-04T10:32:19.3648884Z * [new branch] PR-FixConfigsIssue -> origin/PR-FixConfigsIssue 2025-12-04T10:32:19.3649398Z * [new branch] PR-NoneBugFix-viable -> origin/PR-NoneBugFix-viable 2025-12-04T10:32:19.3649974Z * [new branch] PR-ResetToZero -> origin/PR-ResetToZero 2025-12-04T10:32:19.3650987Z * [new branch] Update-Flash-Packaging -> origin/Update-Flash-Packaging 2025-12-04T10:32:19.3651449Z * [new branch] VLA_exp -> origin/VLA_exp 2025-12-04T10:32:19.3651740Z * [new branch] activation_bench -> origin/activation_bench 
2025-12-04T10:32:19.3652023Z * [new branch] addmm-heuristic -> origin/addmm-heuristic 2025-12-04T10:32:19.3652291Z * [new branch] adi/onednn_aarch64 -> origin/adi/onednn_aarch64 2025-12-04T10:32:19.3652562Z * [new branch] adi/test -> origin/adi/test 2025-12-04T10:32:19.3652818Z * [new branch] adi/test_bgemm -> origin/adi/test_bgemm 2025-12-04T10:32:19.3653073Z * [new branch] adi/test_m8g -> origin/adi/test_m8g 2025-12-04T10:32:19.3653333Z * [new branch] adi/test_onednn -> origin/adi/test_onednn 2025-12-04T10:32:19.3653609Z * [new branch] adi/test_onednn_v3.9 -> origin/adi/test_onednn_v3.9 2025-12-04T10:32:19.3653877Z * [new branch] adi/test_presve_change -> origin/adi/test_presve_change 2025-12-04T10:32:19.3654147Z * [new branch] adi/test_timm -> origin/adi/test_timm 2025-12-04T10:32:19.3654541Z * [new branch] adi/testpresve_change -> origin/adi/testpresve_change 2025-12-04T10:32:19.3654837Z * [new branch] aditew01/test/vec_bf16 -> origin/aditew01/test/vec_bf16 2025-12-04T10:32:19.3655140Z * [new branch] ah-globalfeedback-hook -> origin/ah-globalfeedback-hook 2025-12-04T10:32:19.3655576Z * [new branch] albanD-patch-1 -> origin/albanD-patch-1 2025-12-04T10:32:19.3655846Z * [new branch] also-surround-shimh -> origin/also-surround-shimh 2025-12-04T10:32:19.3656126Z * [new branch] angelayi/aot_compile -> origin/angelayi/aot_compile 2025-12-04T10:32:19.3656458Z * [new branch] angelayi/aoti_additional_files -> origin/angelayi/aoti_additional_files 2025-12-04T10:32:19.3656772Z * [new branch] angelayi/benchmark -> origin/angelayi/benchmark 2025-12-04T10:32:19.3657103Z * [new branch] angelayi/change_pytree_serialization -> origin/angelayi/change_pytree_serialization 2025-12-04T10:32:19.3657463Z * [new branch] angelayi/cpp_loader -> origin/angelayi/cpp_loader 2025-12-04T10:32:19.3657760Z * [new branch] angelayi/inductor_const -> origin/angelayi/inductor_const 2025-12-04T10:32:19.3658026Z * [new branch] angelayi/lstm -> origin/angelayi/lstm 2025-12-04T10:32:19.3658289Z * [new branch] angelayi/no_so_weight -> origin/angelayi/no_so_weight 2025-12-04T10:32:19.3658569Z * [new branch] angelayi/scan_layers -> origin/angelayi/scan_layers 2025-12-04T10:32:19.3658845Z * [new branch] angelayi/side_eff -> origin/angelayi/side_eff 2025-12-04T10:32:19.3659119Z * [new branch] angelayi/state_dict -> origin/angelayi/state_dict 2025-12-04T10:32:19.3659395Z * [new branch] angelayi/symint_input -> origin/angelayi/symint_input 2025-12-04T10:32:19.3659709Z * [new branch] angelayi/symm_mem -> origin/angelayi/symm_mem 2025-12-04T10:32:19.3659977Z * [new branch] angelayi/test_cpp -> origin/angelayi/test_cpp 2025-12-04T10:32:19.3660263Z * [new branch] angelayi/torch_size -> origin/angelayi/torch_size 2025-12-04T10:32:19.3660526Z * [new branch] annotate_assert -> origin/annotate_assert 2025-12-04T10:32:19.3660795Z * [new branch] annotate_fallback_kernel -> origin/annotate_fallback_kernel 2025-12-04T10:32:19.3661085Z * [new branch] annotation_deepcopy -> origin/annotation_deepcopy 2025-12-04T10:32:19.3661359Z * [new branch] annotation_dynamo -> origin/annotation_dynamo 2025-12-04T10:32:19.3661654Z * [new branch] aot_eager_stack_trace -> origin/aot_eager_stack_trace 2025-12-04T10:32:19.3661860Z * [new branch] aoti-cuda-alloc -> origin/aoti-cuda-alloc 2025-12-04T10:32:19.3662065Z * [new branch] aoti_const_device -> origin/aoti_const_device 2025-12-04T10:32:19.3662273Z * [new branch] aoti_fqn_name_interface -> origin/aoti_fqn_name_interface 2025-12-04T10:32:19.3662519Z * [new branch] aoti_package_weights_binary -> 
origin/aoti_package_weights_binary 2025-12-04T10:32:19.3662747Z * [new branch] aoti_target_windows -> origin/aoti_target_windows 2025-12-04T10:32:19.3663001Z * [new branch] arsh/feat/inductor_check_profiling -> origin/arsh/feat/inductor_check_profiling 2025-12-04T10:32:19.3663246Z * [new branch] async_tp -> origin/async_tp 2025-12-04T10:32:19.3663488Z * [new branch] atalman-inductor-perf-cu124 -> origin/atalman-inductor-perf-cu124 2025-12-04T10:32:19.3663768Z * [new branch] atalman-inductor-perf-cu124.1 -> origin/atalman-inductor-perf-cu124.1 2025-12-04T10:32:19.3664019Z * [new branch] atalman-patch-2 -> origin/atalman-patch-2 2025-12-04T10:32:19.3664228Z * [new branch] atalman-patch-3 -> origin/atalman-patch-3 2025-12-04T10:32:19.3664467Z * [new branch] atalman-patch-4 -> origin/atalman-patch-4 2025-12-04T10:32:19.3664670Z * [new branch] atalman-patch-5 -> origin/atalman-patch-5 2025-12-04T10:32:19.3664872Z * [new branch] atalman-patch-6 -> origin/atalman-patch-6 2025-12-04T10:32:19.3665070Z * [new branch] atalman-patch-7 -> origin/atalman-patch-7 2025-12-04T10:32:19.3665275Z * [new branch] atalman-patch-8 -> origin/atalman-patch-8 2025-12-04T10:32:19.3665505Z * [new branch] atalman_inductor_2.3.1 -> origin/atalman_inductor_2.3.1 2025-12-04T10:32:19.3665725Z * [new branch] atalman_inductor_2.4.0 -> origin/atalman_inductor_2.4.0 2025-12-04T10:32:19.3665951Z * [new branch] atalman_inductor_2.4.x -> origin/atalman_inductor_2.4.x 2025-12-04T10:32:19.3666192Z * [new branch] attention_benchmarking_clean -> origin/attention_benchmarking_clean 2025-12-04T10:32:19.3666441Z * [new branch] bahuang/dt_fix_scalar_add -> origin/bahuang/dt_fix_scalar_add 2025-12-04T10:32:19.3666675Z * [new branch] bahuang/fix_debug_mode -> origin/bahuang/fix_debug_mode 2025-12-04T10:32:19.3666884Z * [new branch] bahuang/fix_expand -> origin/bahuang/fix_expand 2025-12-04T10:32:19.3667090Z * [new branch] bahuang/test -> origin/bahuang/test 2025-12-04T10:32:19.3667280Z * [new branch] base/1.5 -> origin/base/1.5 2025-12-04T10:32:19.3667515Z * [new branch] batching_sdpa_efficient_attention -> origin/batching_sdpa_efficient_attention 2025-12-04T10:32:19.3667765Z * [new branch] bench_scaled_mm_ops -> origin/bench_scaled_mm_ops 2025-12-04T10:32:19.3667977Z * [new branch] benchmark-updates -> origin/benchmark-updates 2025-12-04T10:32:19.3668198Z * [new branch] benchmarking-script -> origin/benchmarking-script 2025-12-04T10:32:19.3668412Z * [new branch] bertmaher/pinbump26 -> origin/bertmaher/pinbump26 2025-12-04T10:32:19.3668626Z * [new branch] bertrand/cutlass -> origin/bertrand/cutlass 2025-12-04T10:32:19.3668831Z * [new branch] bf/bug-static-input -> origin/bf/bug-static-input 2025-12-04T10:32:19.3669034Z * [new branch] bf/cg-backend -> origin/bf/cg-backend 2025-12-04T10:32:19.3669234Z * [new branch] bf/cg-nccl-test -> origin/bf/cg-nccl-test 2025-12-04T10:32:19.3669463Z * [new branch] bf/cg-remove-check -> origin/bf/cg-remove-check 2025-12-04T10:32:19.3669723Z * [new branch] bf/clean-torchbench-hf -> origin/bf/clean-torchbench-hf 2025-12-04T10:32:19.3669947Z * [new branch] bf/combo-debug-log -> origin/bf/combo-debug-log 2025-12-04T10:32:19.3670148Z * [new branch] bf/cudagraph -> origin/bf/cudagraph 2025-12-04T10:32:19.3670415Z * [new branch] bf/cudagraph-disable-input-mutation -> origin/bf/cudagraph-disable-input-mutation 2025-12-04T10:32:19.3670832Z * [new branch] bf/cudagraph-enable-input-mutation-support-benchmark -> origin/bf/cudagraph-enable-input-mutation-support-benchmark 2025-12-04T10:32:19.3671183Z * [new branch] 
bf/cudagraph-partition -> origin/bf/cudagraph-partition 2025-12-04T10:32:19.3671405Z * [new branch] bf/donated-buffer-bench -> origin/bf/donated-buffer-bench 2025-12-04T10:32:19.3671609Z * [new branch] bf/dynamo-partition -> origin/bf/dynamo-partition 2025-12-04T10:32:19.3671781Z * [new branch] bf/lite -> origin/bf/lite 2025-12-04T10:32:19.3671962Z * [new branch] bf/pa-non-divisible -> origin/bf/pa-non-divisible 2025-12-04T10:32:19.3672184Z * [new branch] bf/partition-cache-free-symbols -> origin/bf/partition-cache-free-symbols 2025-12-04T10:32:19.3672465Z * [new branch] bf/partition-memory-plan -> origin/bf/partition-memory-plan 2025-12-04T10:32:19.3672669Z * [new branch] bf/partition-move-cpu -> origin/bf/partition-move-cpu 2025-12-04T10:32:19.3672878Z * [new branch] bf/partition-view-fallback -> origin/bf/partition-view-fallback 2025-12-04T10:32:19.3673096Z * [new branch] bf/remove-check-55b0c39d -> origin/bf/remove-check-55b0c39d 2025-12-04T10:32:19.3673292Z * [new branch] bf/timm-nov-26-2025 -> origin/bf/timm-nov-26-2025 2025-12-04T10:32:19.3673496Z * [new branch] bf/transformer-pin-4-57-3 -> origin/bf/transformer-pin-4-57-3 2025-12-04T10:32:19.3673715Z * [new branch] bisect_perf_hf_T5_3acc6eac492 -> origin/bisect_perf_hf_T5_3acc6eac492 2025-12-04T10:32:19.3673939Z * [new branch] bisect_perf_hf_T5_3fcf66f61fb -> origin/bisect_perf_hf_T5_3fcf66f61fb 2025-12-04T10:32:19.3674158Z * [new branch] bisect_perf_hf_T5_4009d154129 -> origin/bisect_perf_hf_T5_4009d154129 2025-12-04T10:32:19.3674368Z * [new branch] bisect_perf_hf_T5_40d0740e73d -> origin/bisect_perf_hf_T5_40d0740e73d 2025-12-04T10:32:19.3674573Z * [new branch] bisect_perf_hf_T5_5268754e -> origin/bisect_perf_hf_T5_5268754e 2025-12-04T10:32:19.3674779Z * [new branch] bisect_perf_hf_T5_7d89a8d385c -> origin/bisect_perf_hf_T5_7d89a8d385c 2025-12-04T10:32:19.3675002Z * [new branch] bisect_perf_hf_T5_b7a25c1ee7c -> origin/bisect_perf_hf_T5_b7a25c1ee7c 2025-12-04T10:32:19.3675220Z * [new branch] bisect_perf_hf_T5_c25b201583f -> origin/bisect_perf_hf_T5_c25b201583f 2025-12-04T10:32:19.3675429Z * [new branch] bisect_perf_hf_T5_c93e57efac0 -> origin/bisect_perf_hf_T5_c93e57efac0 2025-12-04T10:32:19.3675643Z * [new branch] bisect_perf_hf_T5_ca9813ea149 -> origin/bisect_perf_hf_T5_ca9813ea149 2025-12-04T10:32:19.3675855Z * [new branch] bisect_perf_hf_T5_d65f194a -> origin/bisect_perf_hf_T5_d65f194a 2025-12-04T10:32:19.3676060Z * [new branch] bisect_perf_hf_T5_da94ab0b -> origin/bisect_perf_hf_T5_da94ab0b 2025-12-04T10:32:19.3676271Z * [new branch] bisect_perf_hf_T5_da94ab0b_new -> origin/bisect_perf_hf_T5_da94ab0b_new 2025-12-04T10:32:19.3676488Z * [new branch] bisect_perf_hf_T5_db4e8a1d8a8 -> origin/bisect_perf_hf_T5_db4e8a1d8a8 2025-12-04T10:32:19.3676696Z * [new branch] bisect_perf_hf_T5_e0d97e936a2 -> origin/bisect_perf_hf_T5_e0d97e936a2 2025-12-04T10:32:19.3676943Z * [new branch] bisect_perf_hf_T5_f23621ec563 -> origin/bisect_perf_hf_T5_f23621ec563 2025-12-04T10:32:19.3677149Z * [new branch] brister/fx_device_type -> origin/brister/fx_device_type 2025-12-04T10:32:19.3677362Z * [new branch] brister/test_inductor_all_fx -> origin/brister/test_inductor_all_fx 2025-12-04T10:32:19.3677616Z * [new branch] brister/tiled_reduction_no_numel_check -> origin/brister/tiled_reduction_no_numel_check 2025-12-04T10:32:19.3677844Z * [new branch] bwd-backup -> origin/bwd-backup 2025-12-04T10:32:19.3678010Z * [new branch] c57382a49 -> origin/c57382a49 2025-12-04T10:32:19.3678174Z * [new branch] ca_0431d47eaa -> origin/ca_0431d47eaa 
2025-12-04T10:32:19.3678345Z * [new branch] ca_fix_0431d47eaa -> origin/ca_fix_0431d47eaa 2025-12-04T10:32:19.3678548Z * [new branch] camyllh/test_setup_hooks_push -> origin/camyllh/test_setup_hooks_push 2025-12-04T10:32:19.3678758Z * [new branch] cccclai-patch-1 -> origin/cccclai-patch-1 2025-12-04T10:32:19.3679000Z * [new branch] cherry-pick-159969-by-pytorch_bot_bot_ -> origin/cherry-pick-159969-by-pytorch_bot_bot_ 2025-12-04T10:32:19.3679328Z * [new branch] cherry-pick-160586-by-pytorch_bot_bot_ -> origin/cherry-pick-160586-by-pytorch_bot_bot_ 2025-12-04T10:32:19.3679656Z * [new branch] cherry-pick-162208-by-pytorch_bot_bot_ -> origin/cherry-pick-162208-by-pytorch_bot_bot_ 2025-12-04T10:32:19.3679937Z * [new branch] cherry-pick-163169-by-pytorch_bot_bot_ -> origin/cherry-pick-163169-by-pytorch_bot_bot_ 2025-12-04T10:32:19.3680212Z * [new branch] cherry-pick-165086-by-pytorch_bot_bot_ -> origin/cherry-pick-165086-by-pytorch_bot_bot_ 2025-12-04T10:32:19.3680486Z * [new branch] cherry-pick-165514-by-pytorch_bot_bot_ -> origin/cherry-pick-165514-by-pytorch_bot_bot_ 2025-12-04T10:32:19.3680768Z * [new branch] cherry-pick-165601-by-pytorch_bot_bot_ -> origin/cherry-pick-165601-by-pytorch_bot_bot_ 2025-12-04T10:32:19.3681040Z * [new branch] cherry-pick-165667-by-pytorch_bot_bot_ -> origin/cherry-pick-165667-by-pytorch_bot_bot_ 2025-12-04T10:32:19.3681323Z * [new branch] cherry-pick-165815-by-pytorch_bot_bot_ -> origin/cherry-pick-165815-by-pytorch_bot_bot_ 2025-12-04T10:32:19.3681600Z * [new branch] cherry-pick-165922-by-pytorch_bot_bot_ -> origin/cherry-pick-165922-by-pytorch_bot_bot_ 2025-12-04T10:32:19.3681869Z * [new branch] cherry-pick-166148-by-pytorch_bot_bot_ -> origin/cherry-pick-166148-by-pytorch_bot_bot_ 2025-12-04T10:32:19.3682143Z * [new branch] cherry-pick-166181-by-pytorch_bot_bot_ -> origin/cherry-pick-166181-by-pytorch_bot_bot_ 2025-12-04T10:32:19.3682421Z * [new branch] cherry-pick-166404-by-pytorch_bot_bot_ -> origin/cherry-pick-166404-by-pytorch_bot_bot_ 2025-12-04T10:32:19.3682698Z * [new branch] cherry-pick-166427-by-pytorch_bot_bot_ -> origin/cherry-pick-166427-by-pytorch_bot_bot_ 2025-12-04T10:32:19.3682970Z * [new branch] cherry-pick-166480-by-pytorch_bot_bot_ -> origin/cherry-pick-166480-by-pytorch_bot_bot_ 2025-12-04T10:32:19.3683249Z * [new branch] cherry-pick-166570-by-pytorch_bot_bot_ -> origin/cherry-pick-166570-by-pytorch_bot_bot_ 2025-12-04T10:32:19.3683520Z * [new branch] cherry-pick-166993-by-pytorch_bot_bot_ -> origin/cherry-pick-166993-by-pytorch_bot_bot_ 2025-12-04T10:32:19.3683793Z * [new branch] cherry-pick-167111-by-pytorch_bot_bot_ -> origin/cherry-pick-167111-by-pytorch_bot_bot_ 2025-12-04T10:32:19.3684066Z * [new branch] cherry-pick-167478-by-pytorch_bot_bot_ -> origin/cherry-pick-167478-by-pytorch_bot_bot_ 2025-12-04T10:32:19.3684301Z * [new branch] cherry_pick_166036_166040 -> origin/cherry_pick_166036_166040 2025-12-04T10:32:19.3684540Z * [new branch] cherry_pick_166457 -> origin/cherry_pick_166457 2025-12-04T10:32:19.3684752Z * [new branch] cherrypick_166338 -> origin/cherrypick_166338 2025-12-04T10:32:19.3684934Z * [new branch] cherrypick_166458 -> origin/cherrypick_166458 2025-12-04T10:32:19.3685114Z * [new branch] cherrypick_166586 -> origin/cherrypick_166586 2025-12-04T10:32:19.3685298Z * [new branch] cherrypick_166956 -> origin/cherrypick_166956 2025-12-04T10:32:19.3685470Z * [new branch] ci_attn -> origin/ci_attn 2025-12-04T10:32:19.3685635Z * [new branch] codex-testing -> origin/codex-testing 2025-12-04T10:32:19.3685897Z * [new branch] 
codex/add-check_memory_overlap-helper-functions -> origin/codex/add-check_memory_overlap-helper-functions 2025-12-04T10:32:19.3686205Z * [new branch] codex/fix-issue-121219-in-pytorch -> origin/codex/fix-issue-121219-in-pytorch 2025-12-04T10:32:19.3686518Z * [new branch] codex/investigate-segfaults-in-get_tensor_storage_id -> origin/codex/investigate-segfaults-in-get_tensor_storage_id 2025-12-04T10:32:19.3686887Z * [new branch] codex/refactor-lintrunner-config-to-use-uv-run -> origin/codex/refactor-lintrunner-config-to-use-uv-run 2025-12-04T10:32:19.3687196Z * [new branch] compatiblpy39util -> origin/compatiblpy39util 2025-12-04T10:32:19.3687377Z * [new branch] cond_hop_device -> origin/cond_hop_device 2025-12-04T10:32:19.3687552Z * [new branch] context_test -> origin/context_test 2025-12-04T10:32:19.3687789Z * [new branch] copilot/code-style-cleanup-python-pip -> origin/copilot/code-style-cleanup-python-pip 2025-12-04T10:32:19.3688033Z * [new branch] cpio/fix_new_ami_tests -> origin/cpio/fix_new_ami_tests 2025-12-04T10:32:19.3688256Z * [new branch] cpp-docs-dependency-upgrade -> origin/cpp-docs-dependency-upgrade 2025-12-04T10:32:19.3688509Z * [new branch] crpa/typo-in-inductor_comm_lowering -> origin/crpa/typo-in-inductor_comm_lowering 2025-12-04T10:32:19.3688734Z * [new branch] csl/always_produce_xml -> origin/csl/always_produce_xml 2025-12-04T10:32:19.3688949Z * [new branch] csl/build_test_more_procs -> origin/csl/build_test_more_procs 2025-12-04T10:32:19.3689160Z * [new branch] csl/build_test_more_procs2 -> origin/csl/build_test_more_procs2 2025-12-04T10:32:19.3689347Z * [new branch] csl/clean_up -> origin/csl/clean_up 2025-12-04T10:32:19.3689542Z * [new branch] csl/fix_retry_segfault_exit -> origin/csl/fix_retry_segfault_exit 2025-12-04T10:32:19.3689766Z * [new branch] csl/katex -> origin/csl/katex 2025-12-04T10:32:19.3689935Z * [new branch] csl/larger_runner -> origin/csl/larger_runner 2025-12-04T10:32:19.3690121Z * [new branch] csl/lint_testing -> origin/csl/lint_testing 2025-12-04T10:32:19.3690293Z * [new branch] csl/lint_thing -> origin/csl/lint_thing 2025-12-04T10:32:19.3690478Z * [new branch] csl/lintrunner_stuff -> origin/csl/lintrunner_stuff 2025-12-04T10:32:19.3690674Z * [new branch] csl/manually_gen_json -> origin/csl/manually_gen_json 2025-12-04T10:32:19.3690854Z * [new branch] csl/mps_sharding -> origin/csl/mps_sharding 2025-12-04T10:32:19.3691044Z * [new branch] csl/multistage_docker -> origin/csl/multistage_docker 2025-12-04T10:32:19.3691231Z * [new branch] csl/print_timing -> origin/csl/print_timing 2025-12-04T10:32:19.3691411Z * [new branch] csl/remove_experiment -> origin/csl/remove_experiment 2025-12-04T10:32:19.3691653Z * [new branch] csl/remove_maybe_unused_var -> origin/csl/remove_maybe_unused_var 2025-12-04T10:32:19.3691888Z * [new branch] csl/remove_repo_specific_autolabel -> origin/csl/remove_repo_specific_autolabel 2025-12-04T10:32:19.3692115Z * [new branch] csl/remove_run_parallel -> origin/csl/remove_run_parallel 2025-12-04T10:32:19.3692311Z * [new branch] csl/remove_unused_vars -> origin/csl/remove_unused_vars 2025-12-04T10:32:19.3692501Z * [new branch] csl/revert_open -> origin/csl/revert_open 2025-12-04T10:32:19.3692675Z * [new branch] csl/skip_build -> origin/csl/skip_build 2025-12-04T10:32:19.3692875Z * [new branch] csl/smaller_avx_amx_runenrs -> origin/csl/smaller_avx_amx_runenrs 2025-12-04T10:32:19.3693070Z * [new branch] csl/td_job_level -> origin/csl/td_job_level 2025-12-04T10:32:19.3693274Z * [new branch] csl/test_cuda_build_large_runner -> 
origin/csl/test_cuda_build_large_runner 2025-12-04T10:32:19.3693526Z * [new branch] csl/test_owners_autograd_dispatch_nn -> origin/csl/test_owners_autograd_dispatch_nn 2025-12-04T10:32:19.3693778Z * [new branch] csl/test_owners_higher_confidence -> origin/csl/test_owners_higher_confidence 2025-12-04T10:32:19.3693996Z * [new branch] csl/upload_json_running -> origin/csl/upload_json_running 2025-12-04T10:32:19.3694223Z * [new branch] csl/win_sccache -> origin/csl/win_sccache 2025-12-04T10:32:19.3694395Z * [new branch] csl/xml_stuff -> origin/csl/xml_stuff 2025-12-04T10:32:19.3694564Z * [new branch] cublasrelax2 -> origin/cublasrelax2 2025-12-04T10:32:19.3694736Z * [new branch] cuda_mempool -> origin/cuda_mempool 2025-12-04T10:32:19.3694920Z * [new branch] custom_lowering_dict -> origin/custom_lowering_dict 2025-12-04T10:32:19.3695115Z * [new branch] d4l3k/debug_plane_frtrace -> origin/d4l3k/debug_plane_frtrace 2025-12-04T10:32:19.3695305Z * [new branch] daxia6/2.8o3 -> origin/daxia6/2.8o3 2025-12-04T10:32:19.3695476Z * [new branch] debug-guard -> origin/debug-guard 2025-12-04T10:32:19.3695657Z * [new branch] delete-quant-docs -> origin/delete-quant-docs 2025-12-04T10:32:19.3695991Z * [new branch] dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.57.0 -> origin/dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.57.0 2025-12-04T10:32:19.3696456Z * [new branch] dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.57.1 -> origin/dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.57.1 2025-12-04T10:32:19.3696791Z * [new branch] desertfire/test_cpp_wrapper -> origin/desertfire/test_cpp_wrapper 2025-12-04T10:32:19.3697040Z * [new branch] desertfire/triton-cpu-for-aarch64 -> origin/desertfire/triton-cpu-for-aarch64 2025-12-04T10:32:19.3697277Z * [new branch] dev/dhruva/flex_attn_opt -> origin/dev/dhruva/flex_attn_opt 2025-12-04T10:32:19.3697479Z * [new branch] dev/joona/MPSNDArrayAdd -> origin/dev/joona/MPSNDArrayAdd 2025-12-04T10:32:19.3697676Z * [new branch] dev/joona/Unranked -> origin/dev/joona/Unranked 2025-12-04T10:32:19.3697858Z * [new branch] dev/joona/cat -> origin/dev/joona/cat 2025-12-04T10:32:19.3698043Z * [new branch] dev/joona/embeddingbag -> origin/dev/joona/embeddingbag 2025-12-04T10:32:19.3698249Z * [new branch] dev/joona/fix_sdpa_memtest -> origin/dev/joona/fix_sdpa_memtest 2025-12-04T10:32:19.3698467Z * [new branch] dev/joona/getTensorsString -> origin/dev/joona/getTensorsString 2025-12-04T10:32:19.3698693Z * [new branch] dev/joona/mps_linear_macos14 -> origin/dev/joona/mps_linear_macos14 2025-12-04T10:32:19.3698935Z * [new branch] dev/joona/scalar_clamp -> origin/dev/joona/scalar_clamp 2025-12-04T10:32:19.3699117Z * [new branch] dev/joona/sdpa -> origin/dev/joona/sdpa 2025-12-04T10:32:19.3699299Z * [new branch] dev/joona/sdpa_api -> origin/dev/joona/sdpa_api 2025-12-04T10:32:19.3699484Z * [new branch] dev/joona/type_inf -> origin/dev/joona/type_inf 2025-12-04T10:32:19.3699783Z * [new branch] dev/joona/ulpAssertClose -> origin/dev/joona/ulpAssertClose 2025-12-04T10:32:19.3699980Z * [new branch] dev/joona/upsize3d -> origin/dev/joona/upsize3d 2025-12-04T10:32:19.3700161Z * [new branch] disp_counter -> origin/disp_counter 2025-12-04T10:32:19.3700339Z * [new branch] divyanshk-patch-1 -> origin/divyanshk-patch-1 2025-12-04T10:32:19.3700516Z * [new branch] docs -> origin/docs 2025-12-04T10:32:19.3700688Z * [new branch] documentation -> origin/documentation 2025-12-04T10:32:19.3700872Z * [new branch] eager_model_benchmarks -> 
origin/eager_model_benchmarks 2025-12-04T10:32:19.3701084Z * [new branch] embg/test_inductor_ci_control -> origin/embg/test_inductor_ci_control 2025-12-04T10:32:19.3701309Z * [new branch] embg/triton_l2_prefetch_128B -> origin/embg/triton_l2_prefetch_128B 2025-12-04T10:32:19.3701566Z * [new branch] embg/triton_l2_prefetch_256B -> origin/embg/triton_l2_prefetch_256B 2025-12-04T10:32:19.3701765Z * [new branch] eqy-patch-1 -> origin/eqy-patch-1 2025-12-04T10:32:19.3701939Z * [new branch] eqy-patch-2 -> origin/eqy-patch-2 2025-12-04T10:32:19.3702105Z * [new branch] eqy-patch-3 -> origin/eqy-patch-3 2025-12-04T10:32:19.3702270Z * [new branch] eqy-patch-4 -> origin/eqy-patch-4 2025-12-04T10:32:19.3702439Z * [new branch] eqy-patch-5 -> origin/eqy-patch-5 2025-12-04T10:32:19.3702603Z * [new branch] eqy-patch-6 -> origin/eqy-patch-6 2025-12-04T10:32:19.3702788Z * [new branch] exclamaforte/amd-ma -> origin/exclamaforte/amd-ma 2025-12-04T10:32:19.3703019Z * [new branch] exclamaforte/combo-kernels-perf-run -> origin/exclamaforte/combo-kernels-perf-run 2025-12-04T10:32:19.3703278Z * [new branch] exclamaforte/do_bench_refactor -> origin/exclamaforte/do_bench_refactor 2025-12-04T10:32:19.3703534Z * [new branch] exclamaforte/enable-mem-dep-fusion -> origin/exclamaforte/enable-mem-dep-fusion 2025-12-04T10:32:19.3703812Z * [new branch] exclamaforte/fix-exhaustive-autotuning -> origin/exclamaforte/fix-exhaustive-autotuning 2025-12-04T10:32:19.3704105Z * [new branch] exclamaforte/fix-trace-parsing-fx-svg -> origin/exclamaforte/fix-trace-parsing-fx-svg 2025-12-04T10:32:19.3704413Z * [new branch] exclamaforte/force-pointwise-cat-perf-run -> origin/exclamaforte/force-pointwise-cat-perf-run 2025-12-04T10:32:19.3704677Z * [new branch] exclamaforte/fusion-data -> origin/exclamaforte/fusion-data 2025-12-04T10:32:19.3704908Z * [new branch] exclamaforte/gemm-benchmark-run -> origin/exclamaforte/gemm-benchmark-run 2025-12-04T10:32:19.3705160Z * [new branch] exclamaforte/gemm-export-model -> origin/exclamaforte/gemm-export-model 2025-12-04T10:32:19.3705380Z * [new branch] exclamaforte/gemm-model -> origin/exclamaforte/gemm-model 2025-12-04T10:32:19.3705650Z * [new branch] exclamaforte/gemm-model-all-data-collection -> origin/exclamaforte/gemm-model-all-data-collection 2025-12-04T10:32:19.3705920Z * [new branch] exclamaforte/gemm-to-amd -> origin/exclamaforte/gemm-to-amd 2025-12-04T10:32:19.3706140Z * [new branch] exclamaforte/just-gemm-model -> origin/exclamaforte/just-gemm-model 2025-12-04T10:32:19.3706458Z * [new branch] exclamaforte/just-gemm-model-no-refactor -> origin/exclamaforte/just-gemm-model-no-refactor 2025-12-04T10:32:19.3706734Z * [new branch] exclamaforte/profile-diff-algo -> origin/exclamaforte/profile-diff-algo 2025-12-04T10:32:19.3706993Z * [new branch] exclamaforte/profiler-visualization -> origin/exclamaforte/profiler-visualization 2025-12-04T10:32:19.3707260Z * [new branch] exclamaforte/test_cpp_wrapper_mode -> origin/exclamaforte/test_cpp_wrapper_mode 2025-12-04T10:32:19.3707532Z * [new branch] exclamaforte/update-autotune-configs -> origin/exclamaforte/update-autotune-configs 2025-12-04T10:32:19.3707816Z * [new branch] exclamaforte/update-autotune-configs-2 -> origin/exclamaforte/update-autotune-configs-2 2025-12-04T10:32:19.3708050Z * [new branch] exec -> origin/exec 2025-12-04T10:32:19.3708235Z * [new branch] experimental-mosaic -> origin/experimental-mosaic 2025-12-04T10:32:19.3708428Z * [new branch] export-D61047529 -> origin/export-D61047529 2025-12-04T10:32:19.3708674Z * [new branch] 
export-D71412006 -> origin/export-D71412006 2025-12-04T10:32:19.3708857Z * [new branch] export-D73042989 -> origin/export-D73042989 2025-12-04T10:32:19.3709063Z * [new branch] export-D78957093 -> origin/export-D78957093 2025-12-04T10:32:19.3709239Z * [new branch] export-D78996107 -> origin/export-D78996107 2025-12-04T10:32:19.3709412Z * [new branch] export-D80823877 -> origin/export-D80823877 2025-12-04T10:32:19.3709646Z * [new branch] export-D80958642 -> origin/export-D80958642 2025-12-04T10:32:19.3709824Z * [new branch] export-D81054193 -> origin/export-D81054193 2025-12-04T10:32:19.3710001Z * [new branch] export-D81204584 -> origin/export-D81204584 2025-12-04T10:32:19.3710180Z * [new branch] export-D81429090 -> origin/export-D81429090 2025-12-04T10:32:19.3710353Z * [new branch] export-D82250826 -> origin/export-D82250826 2025-12-04T10:32:19.3710527Z * [new branch] export-D82253817 -> origin/export-D82253817 2025-12-04T10:32:19.3710700Z * [new branch] export-D83541846 -> origin/export-D83541846 2025-12-04T10:32:19.3710872Z * [new branch] export-D83627170 -> origin/export-D83627170 2025-12-04T10:32:19.3711042Z * [new branch] export-D83766701 -> origin/export-D83766701 2025-12-04T10:32:19.3711210Z * [new branch] export-D83768878 -> origin/export-D83768878 2025-12-04T10:32:19.3711385Z * [new branch] export-D83769447 -> origin/export-D83769447 2025-12-04T10:32:19.3711553Z * [new branch] export-D84089824 -> origin/export-D84089824 2025-12-04T10:32:19.3711726Z * [new branch] export-D84213020 -> origin/export-D84213020 2025-12-04T10:32:19.3711897Z * [new branch] export-D84373821 -> origin/export-D84373821 2025-12-04T10:32:19.3712070Z * [new branch] export-D84612194 -> origin/export-D84612194 2025-12-04T10:32:19.3712246Z * [new branch] export-D84890985 -> origin/export-D84890985 2025-12-04T10:32:19.3712418Z * [new branch] export-D85122326 -> origin/export-D85122326 2025-12-04T10:32:19.3712589Z * [new branch] export-D86256198 -> origin/export-D86256198 2025-12-04T10:32:19.3712767Z * [new branch] export-D86460608 -> origin/export-D86460608 2025-12-04T10:32:19.3712940Z * [new branch] export-D86474796 -> origin/export-D86474796 2025-12-04T10:32:19.3713107Z * [new branch] export-D86712396 -> origin/export-D86712396 2025-12-04T10:32:19.3713329Z * [new branch] export-D87022129 -> origin/export-D87022129 2025-12-04T10:32:19.3713502Z * [new branch] export-D87838959 -> origin/export-D87838959 2025-12-04T10:32:19.3713674Z * [new branch] export-D88319437 -> origin/export-D88319437 2025-12-04T10:32:19.3713902Z * [new branch] exported-model-train-idempotent -> origin/exported-model-train-idempotent 2025-12-04T10:32:19.3714142Z * [new branch] ezyang-titan-october -> origin/ezyang-titan-october 2025-12-04T10:32:19.3714338Z * [new branch] ezyang-titan-october2 -> origin/ezyang-titan-october2 2025-12-04T10:32:19.3714524Z * [new branch] ezyang-war -> origin/ezyang-war 2025-12-04T10:32:19.3714725Z * [new branch] ezyang/wip-aot-descriptors -> origin/ezyang/wip-aot-descriptors 2025-12-04T10:32:19.3714920Z * [new branch] fa_u8_brgemm -> origin/fa_u8_brgemm 2025-12-04T10:32:19.3715117Z * [new branch] fadeputr/sequence_fbgemm -> origin/fadeputr/sequence_fbgemm 2025-12-04T10:32:19.3715320Z * [new branch] fastmath_baseline -> origin/fastmath_baseline 2025-12-04T10:32:19.3715492Z * [new branch] fbcode/warm -> origin/fbcode/warm 2025-12-04T10:32:19.3715657Z * [new branch] fca -> origin/fca 2025-12-04T10:32:19.3715852Z * [new branch] fca2_ca5984c -> origin/fca2_ca5984c 2025-12-04T10:32:19.3716205Z * [new branch] fca5 -> 
origin/fca5 2025-12-04T10:32:19.3716384Z * [new branch] feature/justknobs-cpp -> origin/feature/justknobs-cpp 2025-12-04T10:32:19.3716581Z * [new branch] feature/numa-forkserver -> origin/feature/numa-forkserver 2025-12-04T10:32:19.3716776Z * [new branch] ffast_math_baseline -> origin/ffast_math_baseline 2025-12-04T10:32:19.3716969Z * [new branch] ffast_math_target -> origin/ffast_math_target 2025-12-04T10:32:19.3717151Z * [new branch] findhao/base_commit -> origin/findhao/base_commit 2025-12-04T10:32:19.3717342Z * [new branch] findhao/base_commit1 -> origin/findhao/base_commit1 2025-12-04T10:32:19.3717535Z * [new branch] findhao/multistream2 -> origin/findhao/multistream2 2025-12-04T10:32:19.3717726Z * [new branch] findhao/multistream5 -> origin/findhao/multistream5 2025-12-04T10:32:19.3717920Z * [new branch] findhao/multistream6 -> origin/findhao/multistream6 2025-12-04T10:32:19.3718117Z * [new branch] findhao/operatorbench3 -> origin/findhao/operatorbench3 2025-12-04T10:32:19.3718321Z * [new branch] findhao/operatorbench5 -> origin/findhao/operatorbench5 2025-12-04T10:32:19.3718519Z * [new branch] findhao/tritonparse -> origin/findhao/tritonparse 2025-12-04T10:32:19.3718740Z * [new branch] fix-ck-gemm-template-format -> origin/fix-ck-gemm-template-format 2025-12-04T10:32:19.3718947Z * [new branch] fix-config-ignore -> origin/fix-config-ignore 2025-12-04T10:32:19.3719129Z * [new branch] fix-dict-guard -> origin/fix-dict-guard 2025-12-04T10:32:19.3719305Z * [new branch] fix_addmm_issue -> origin/fix_addmm_issue 2025-12-04T10:32:19.3719505Z * [new branch] fix_amd_missing_cluster_dims -> origin/fix_amd_missing_cluster_dims 2025-12-04T10:32:19.3719754Z * [new branch] fix_bench_bwd_pass -> origin/fix_bench_bwd_pass 2025-12-04T10:32:19.3719944Z * [new branch] fix_mem_profiler_config -> origin/fix_mem_profiler_config 2025-12-04T10:32:19.3720241Z * [new branch] fix_nvrtc_discovery -> origin/fix_nvrtc_discovery 2025-12-04T10:32:19.3720419Z * [new branch] fix_op_runner -> origin/fix_op_runner 2025-12-04T10:32:19.3720641Z * [new branch] fix_ubn_159469 -> origin/fix_ubn_159469 2025-12-04T10:32:19.3720814Z * [new branch] fixes-triage -> origin/fixes-triage 2025-12-04T10:32:19.3720987Z * [new branch] fixflashinfer -> origin/fixflashinfer 2025-12-04T10:32:19.3721164Z * [new branch] flash_decoding_cpu -> origin/flash_decoding_cpu 2025-12-04T10:32:19.3721351Z * [new branch] flex-flash -> origin/flex-flash 2025-12-04T10:32:19.3721560Z * [new branch] flex_attention_functorch_grad -> origin/flex_attention_functorch_grad 2025-12-04T10:32:19.3721755Z * [new branch] flex_flash -> origin/flex_flash 2025-12-04T10:32:19.3721961Z * [new branch] fmassa/fix_memeff_sharding_rule -> origin/fmassa/fix_memeff_sharding_rule 2025-12-04T10:32:19.3722209Z * [new branch] fmassa/tests_comm_compute_scheduler -> origin/fmassa/tests_comm_compute_scheduler 2025-12-04T10:32:19.3722430Z * [new branch] forkserver_fix -> origin/forkserver_fix 2025-12-04T10:32:19.3722610Z * [new branch] fsdp2_trace_rules -> origin/fsdp2_trace_rules 2025-12-04T10:32:19.3722784Z * [new branch] fx_cpp -> origin/fx_cpp 2025-12-04T10:32:19.3722946Z * [new branch] fy/fix-win -> origin/fy/fix-win 2025-12-04T10:32:19.3723156Z * [new branch] galv-patch-1 -> origin/galv-patch-1 2025-12-04T10:32:19.3723388Z * [new branch] galv/cudagraphs-conditional-nodes-4 -> origin/galv/cudagraphs-conditional-nodes-4 2025-12-04T10:32:19.3723645Z * [new branch] georgehong/cmakelists-patch -> origin/georgehong/cmakelists-patch 2025-12-04T10:32:19.3723859Z * [new branch] 
gh/AlnisM/1/base -> origin/gh/AlnisM/1/base 2025-12-04T10:32:19.3724042Z * [new branch] gh/AlnisM/1/head -> origin/gh/AlnisM/1/head 2025-12-04T10:32:19.3724228Z * [new branch] gh/EikanWang/67/base -> origin/gh/EikanWang/67/base 2025-12-04T10:32:19.3724430Z * [new branch] gh/EikanWang/67/head -> origin/gh/EikanWang/67/head 2025-12-04T10:32:19.3724622Z * [new branch] gh/Gasoonjia/1/base -> origin/gh/Gasoonjia/1/base 2025-12-04T10:32:19.3724813Z * [new branch] gh/Gasoonjia/1/head -> origin/gh/Gasoonjia/1/head 2025-12-04T10:32:19.3725001Z * [new branch] gh/H-Huang/131/base -> origin/gh/H-Huang/131/base 2025-12-04T10:32:19.3725185Z * [new branch] gh/H-Huang/131/head -> origin/gh/H-Huang/131/head 2025-12-04T10:32:19.3725366Z * [new branch] gh/H-Huang/131/orig -> origin/gh/H-Huang/131/orig 2025-12-04T10:32:19.3725547Z * [new branch] gh/H-Huang/132/base -> origin/gh/H-Huang/132/base 2025-12-04T10:32:19.3725723Z * [new branch] gh/H-Huang/132/head -> origin/gh/H-Huang/132/head 2025-12-04T10:32:19.3725906Z * [new branch] gh/H-Huang/132/orig -> origin/gh/H-Huang/132/orig 2025-12-04T10:32:19.3726089Z * [new branch] gh/H-Huang/180/base -> origin/gh/H-Huang/180/base 2025-12-04T10:32:19.3726265Z * [new branch] gh/H-Huang/180/head -> origin/gh/H-Huang/180/head 2025-12-04T10:32:19.3726449Z * [new branch] gh/H-Huang/180/orig -> origin/gh/H-Huang/180/orig 2025-12-04T10:32:19.3726629Z * [new branch] gh/H-Huang/182/base -> origin/gh/H-Huang/182/base 2025-12-04T10:32:19.3726805Z * [new branch] gh/H-Huang/182/head -> origin/gh/H-Huang/182/head 2025-12-04T10:32:19.3726989Z * [new branch] gh/H-Huang/182/orig -> origin/gh/H-Huang/182/orig 2025-12-04T10:32:19.3727169Z * [new branch] gh/H-Huang/226/base -> origin/gh/H-Huang/226/base 2025-12-04T10:32:19.3727347Z * [new branch] gh/H-Huang/226/head -> origin/gh/H-Huang/226/head 2025-12-04T10:32:19.3727556Z * [new branch] gh/H-Huang/226/orig -> origin/gh/H-Huang/226/orig 2025-12-04T10:32:19.3727736Z * [new branch] gh/H-Huang/228/base -> origin/gh/H-Huang/228/base 2025-12-04T10:32:19.3727914Z * [new branch] gh/H-Huang/228/head -> origin/gh/H-Huang/228/head 2025-12-04T10:32:19.3728098Z * [new branch] gh/H-Huang/228/orig -> origin/gh/H-Huang/228/orig 2025-12-04T10:32:19.3728293Z * [new branch] gh/IvanKobzarev/150/base -> origin/gh/IvanKobzarev/150/base 2025-12-04T10:32:19.3728495Z * [new branch] gh/IvanKobzarev/150/head -> origin/gh/IvanKobzarev/150/head 2025-12-04T10:32:19.3728699Z * [new branch] gh/IvanKobzarev/150/orig -> origin/gh/IvanKobzarev/150/orig 2025-12-04T10:32:19.3728906Z * [new branch] gh/IvanKobzarev/157/base -> origin/gh/IvanKobzarev/157/base 2025-12-04T10:32:19.3729104Z * [new branch] gh/IvanKobzarev/157/head -> origin/gh/IvanKobzarev/157/head 2025-12-04T10:32:19.3729309Z * [new branch] gh/IvanKobzarev/157/orig -> origin/gh/IvanKobzarev/157/orig 2025-12-04T10:32:19.3729506Z * [new branch] gh/IvanKobzarev/159/base -> origin/gh/IvanKobzarev/159/base 2025-12-04T10:32:19.3729759Z * [new branch] gh/IvanKobzarev/159/head -> origin/gh/IvanKobzarev/159/head 2025-12-04T10:32:19.3729995Z * [new branch] gh/IvanKobzarev/159/orig -> origin/gh/IvanKobzarev/159/orig 2025-12-04T10:32:19.3730193Z * [new branch] gh/IvanKobzarev/162/base -> origin/gh/IvanKobzarev/162/base 2025-12-04T10:32:19.3730395Z * [new branch] gh/IvanKobzarev/162/head -> origin/gh/IvanKobzarev/162/head 2025-12-04T10:32:19.3730597Z * [new branch] gh/IvanKobzarev/162/orig -> origin/gh/IvanKobzarev/162/orig 2025-12-04T10:32:19.3730800Z * [new branch] gh/IvanKobzarev/163/base -> 
origin/gh/IvanKobzarev/163/base 2025-12-04T10:32:19.3731007Z * [new branch] gh/IvanKobzarev/163/head -> origin/gh/IvanKobzarev/163/head 2025-12-04T10:32:19.3731208Z * [new branch] gh/IvanKobzarev/163/orig -> origin/gh/IvanKobzarev/163/orig 2025-12-04T10:32:19.3731405Z * [new branch] gh/IvanKobzarev/166/base -> origin/gh/IvanKobzarev/166/base 2025-12-04T10:32:19.3731614Z * [new branch] gh/IvanKobzarev/166/head -> origin/gh/IvanKobzarev/166/head 2025-12-04T10:32:19.3731816Z * [new branch] gh/IvanKobzarev/166/orig -> origin/gh/IvanKobzarev/166/orig 2025-12-04T10:32:19.3732013Z * [new branch] gh/IvanKobzarev/167/base -> origin/gh/IvanKobzarev/167/base 2025-12-04T10:32:19.3732214Z * [new branch] gh/IvanKobzarev/167/head -> origin/gh/IvanKobzarev/167/head 2025-12-04T10:32:19.3732420Z * [new branch] gh/IvanKobzarev/167/orig -> origin/gh/IvanKobzarev/167/orig 2025-12-04T10:32:19.3732618Z * [new branch] gh/IvanKobzarev/168/base -> origin/gh/IvanKobzarev/168/base 2025-12-04T10:32:19.3732824Z * [new branch] gh/IvanKobzarev/168/head -> origin/gh/IvanKobzarev/168/head 2025-12-04T10:32:19.3733027Z * [new branch] gh/IvanKobzarev/168/orig -> origin/gh/IvanKobzarev/168/orig 2025-12-04T10:32:19.3733229Z * [new branch] gh/IvanKobzarev/169/base -> origin/gh/IvanKobzarev/169/base 2025-12-04T10:32:19.3733433Z * [new branch] gh/IvanKobzarev/169/head -> origin/gh/IvanKobzarev/169/head 2025-12-04T10:32:19.3733635Z * [new branch] gh/IvanKobzarev/169/orig -> origin/gh/IvanKobzarev/169/orig 2025-12-04T10:32:19.3733833Z * [new branch] gh/IvanKobzarev/170/base -> origin/gh/IvanKobzarev/170/base 2025-12-04T10:32:19.3734055Z * [new branch] gh/IvanKobzarev/170/head -> origin/gh/IvanKobzarev/170/head 2025-12-04T10:32:19.3734257Z * [new branch] gh/IvanKobzarev/170/orig -> origin/gh/IvanKobzarev/170/orig 2025-12-04T10:32:19.3734503Z * [new branch] gh/IvanKobzarev/171/base -> origin/gh/IvanKobzarev/171/base 2025-12-04T10:32:19.3734705Z * [new branch] gh/IvanKobzarev/171/head -> origin/gh/IvanKobzarev/171/head 2025-12-04T10:32:19.3734911Z * [new branch] gh/IvanKobzarev/171/orig -> origin/gh/IvanKobzarev/171/orig 2025-12-04T10:32:19.3735111Z * [new branch] gh/IvanKobzarev/172/base -> origin/gh/IvanKobzarev/172/base 2025-12-04T10:32:19.3735317Z * [new branch] gh/IvanKobzarev/172/head -> origin/gh/IvanKobzarev/172/head 2025-12-04T10:32:19.3735522Z * [new branch] gh/IvanKobzarev/172/orig -> origin/gh/IvanKobzarev/172/orig 2025-12-04T10:32:19.3735723Z * [new branch] gh/IvanKobzarev/173/base -> origin/gh/IvanKobzarev/173/base 2025-12-04T10:32:19.3735926Z * [new branch] gh/IvanKobzarev/173/head -> origin/gh/IvanKobzarev/173/head 2025-12-04T10:32:19.3736124Z * [new branch] gh/IvanKobzarev/173/orig -> origin/gh/IvanKobzarev/173/orig 2025-12-04T10:32:19.3736329Z * [new branch] gh/IvanKobzarev/174/base -> origin/gh/IvanKobzarev/174/base 2025-12-04T10:32:19.3736535Z * [new branch] gh/IvanKobzarev/174/head -> origin/gh/IvanKobzarev/174/head 2025-12-04T10:32:19.3736733Z * [new branch] gh/IvanKobzarev/174/orig -> origin/gh/IvanKobzarev/174/orig 2025-12-04T10:32:19.3736959Z * [new branch] gh/IvanKobzarev/175/base -> origin/gh/IvanKobzarev/175/base 2025-12-04T10:32:19.3737161Z * [new branch] gh/IvanKobzarev/175/head -> origin/gh/IvanKobzarev/175/head 2025-12-04T10:32:19.3737358Z * [new branch] gh/IvanKobzarev/175/orig -> origin/gh/IvanKobzarev/175/orig 2025-12-04T10:32:19.3737565Z * [new branch] gh/IvanKobzarev/176/base -> origin/gh/IvanKobzarev/176/base 2025-12-04T10:32:19.3737766Z * [new branch] gh/IvanKobzarev/176/head -> 
origin/gh/IvanKobzarev/176/head 2025-12-04T10:32:19.3737965Z * [new branch] gh/IvanKobzarev/176/orig -> origin/gh/IvanKobzarev/176/orig 2025-12-04T10:32:19.3738169Z * [new branch] gh/IvanKobzarev/177/base -> origin/gh/IvanKobzarev/177/base 2025-12-04T10:32:19.3738371Z * [new branch] gh/IvanKobzarev/177/head -> origin/gh/IvanKobzarev/177/head 2025-12-04T10:32:19.3738571Z * [new branch] gh/IvanKobzarev/177/orig -> origin/gh/IvanKobzarev/177/orig 2025-12-04T10:32:19.3738778Z * [new branch] gh/IvanKobzarev/178/base -> origin/gh/IvanKobzarev/178/base 2025-12-04T10:32:19.3738982Z * [new branch] gh/IvanKobzarev/178/head -> origin/gh/IvanKobzarev/178/head 2025-12-04T10:32:19.3739179Z * [new branch] gh/IvanKobzarev/178/orig -> origin/gh/IvanKobzarev/178/orig 2025-12-04T10:32:19.3739384Z * [new branch] gh/IvanKobzarev/179/base -> origin/gh/IvanKobzarev/179/base 2025-12-04T10:32:19.3739646Z * [new branch] gh/IvanKobzarev/179/head -> origin/gh/IvanKobzarev/179/head 2025-12-04T10:32:19.3739846Z * [new branch] gh/IvanKobzarev/179/orig -> origin/gh/IvanKobzarev/179/orig 2025-12-04T10:32:19.3740050Z * [new branch] gh/IvanKobzarev/180/base -> origin/gh/IvanKobzarev/180/base 2025-12-04T10:32:19.3740252Z * [new branch] gh/IvanKobzarev/180/head -> origin/gh/IvanKobzarev/180/head 2025-12-04T10:32:19.3740456Z * [new branch] gh/IvanKobzarev/180/orig -> origin/gh/IvanKobzarev/180/orig 2025-12-04T10:32:19.3740658Z * [new branch] gh/IvanKobzarev/181/base -> origin/gh/IvanKobzarev/181/base 2025-12-04T10:32:19.3740859Z * [new branch] gh/IvanKobzarev/181/head -> origin/gh/IvanKobzarev/181/head 2025-12-04T10:32:19.3741060Z * [new branch] gh/IvanKobzarev/181/orig -> origin/gh/IvanKobzarev/181/orig 2025-12-04T10:32:19.3741262Z * [new branch] gh/IvanKobzarev/182/base -> origin/gh/IvanKobzarev/182/base 2025-12-04T10:32:19.3741512Z * [new branch] gh/IvanKobzarev/182/head -> origin/gh/IvanKobzarev/182/head 2025-12-04T10:32:19.3741709Z * [new branch] gh/IvanKobzarev/182/orig -> origin/gh/IvanKobzarev/182/orig 2025-12-04T10:32:19.3741910Z * [new branch] gh/IvanKobzarev/183/base -> origin/gh/IvanKobzarev/183/base 2025-12-04T10:32:19.3742112Z * [new branch] gh/IvanKobzarev/183/head -> origin/gh/IvanKobzarev/183/head 2025-12-04T10:32:19.3742316Z * [new branch] gh/IvanKobzarev/183/orig -> origin/gh/IvanKobzarev/183/orig 2025-12-04T10:32:19.3742518Z * [new branch] gh/IvanKobzarev/184/base -> origin/gh/IvanKobzarev/184/base 2025-12-04T10:32:19.3742715Z * [new branch] gh/IvanKobzarev/184/head -> origin/gh/IvanKobzarev/184/head 2025-12-04T10:32:19.3742917Z * [new branch] gh/IvanKobzarev/184/orig -> origin/gh/IvanKobzarev/184/orig 2025-12-04T10:32:19.3743120Z * [new branch] gh/NikhilAPatel/1/base -> origin/gh/NikhilAPatel/1/base 2025-12-04T10:32:19.3743326Z * [new branch] gh/NikhilAPatel/1/head -> origin/gh/NikhilAPatel/1/head 2025-12-04T10:32:19.3743527Z * [new branch] gh/NikhilAPatel/2/base -> origin/gh/NikhilAPatel/2/base 2025-12-04T10:32:19.3743726Z * [new branch] gh/NikhilAPatel/2/head -> origin/gh/NikhilAPatel/2/head 2025-12-04T10:32:19.3743952Z * [new branch] gh/NikhilAPatel/4/base -> origin/gh/NikhilAPatel/4/base 2025-12-04T10:32:19.3744154Z * [new branch] gh/NikhilAPatel/4/head -> origin/gh/NikhilAPatel/4/head 2025-12-04T10:32:19.3744358Z * [new branch] gh/NikhilAPatel/5/base -> origin/gh/NikhilAPatel/5/base 2025-12-04T10:32:19.3744553Z * [new branch] gh/NikhilAPatel/5/head -> origin/gh/NikhilAPatel/5/head 2025-12-04T10:32:19.3744751Z * [new branch] gh/NikhilAPatel/5/orig -> origin/gh/NikhilAPatel/5/orig 
2025-12-04T10:32:19.3744947Z * [new branch] gh/PaliC/17/base -> origin/gh/PaliC/17/base 2025-12-04T10:32:19.3745135Z * [new branch] gh/PaliC/17/head -> origin/gh/PaliC/17/head 2025-12-04T10:32:19.3745315Z * [new branch] gh/PaliC/17/orig -> origin/gh/PaliC/17/orig 2025-12-04T10:32:19.3745493Z * [new branch] gh/PaliC/18/base -> origin/gh/PaliC/18/base 2025-12-04T10:32:19.3745674Z * [new branch] gh/PaliC/18/head -> origin/gh/PaliC/18/head 2025-12-04T10:32:19.3745851Z * [new branch] gh/PaliC/18/orig -> origin/gh/PaliC/18/orig 2025-12-04T10:32:19.3746027Z * [new branch] gh/PaliC/20/base -> origin/gh/PaliC/20/base 2025-12-04T10:32:19.3746206Z * [new branch] gh/PaliC/20/head -> origin/gh/PaliC/20/head 2025-12-04T10:32:19.3746380Z * [new branch] gh/PaliC/20/orig -> origin/gh/PaliC/20/orig 2025-12-04T10:32:19.3746556Z * [new branch] gh/PaliC/21/base -> origin/gh/PaliC/21/base 2025-12-04T10:32:19.3746729Z * [new branch] gh/PaliC/21/head -> origin/gh/PaliC/21/head 2025-12-04T10:32:19.3746905Z * [new branch] gh/PaliC/21/orig -> origin/gh/PaliC/21/orig 2025-12-04T10:32:19.3747082Z * [new branch] gh/PaliC/23/base -> origin/gh/PaliC/23/base 2025-12-04T10:32:19.3747259Z * [new branch] gh/PaliC/23/head -> origin/gh/PaliC/23/head 2025-12-04T10:32:19.3747435Z * [new branch] gh/PaliC/23/orig -> origin/gh/PaliC/23/orig 2025-12-04T10:32:19.3747606Z * [new branch] gh/PaliC/24/base -> origin/gh/PaliC/24/base 2025-12-04T10:32:19.3747785Z * [new branch] gh/PaliC/24/head -> origin/gh/PaliC/24/head 2025-12-04T10:32:19.3747965Z * [new branch] gh/PaliC/24/orig -> origin/gh/PaliC/24/orig 2025-12-04T10:32:19.3748137Z * [new branch] gh/PaliC/25/head -> origin/gh/PaliC/25/head 2025-12-04T10:32:19.3748343Z * [new branch] gh/PaliC/25/next -> origin/gh/PaliC/25/next 2025-12-04T10:32:19.3748520Z * [new branch] gh/PaliC/25/orig -> origin/gh/PaliC/25/orig 2025-12-04T10:32:19.3748693Z * [new branch] gh/PaliC/26/head -> origin/gh/PaliC/26/head 2025-12-04T10:32:19.3748871Z * [new branch] gh/PaliC/26/next -> origin/gh/PaliC/26/next 2025-12-04T10:32:19.3749048Z * [new branch] gh/PaliC/26/orig -> origin/gh/PaliC/26/orig 2025-12-04T10:32:19.3749221Z * [new branch] gh/PaliC/27/next -> origin/gh/PaliC/27/next 2025-12-04T10:32:19.3749396Z * [new branch] gh/PaliC/28/head -> origin/gh/PaliC/28/head 2025-12-04T10:32:19.3749625Z * [new branch] gh/PaliC/28/next -> origin/gh/PaliC/28/next 2025-12-04T10:32:19.3749798Z * [new branch] gh/PaliC/28/orig -> origin/gh/PaliC/28/orig 2025-12-04T10:32:19.3749984Z * [new branch] gh/PaliC/29/head -> origin/gh/PaliC/29/head 2025-12-04T10:32:19.3750161Z * [new branch] gh/PaliC/29/next -> origin/gh/PaliC/29/next 2025-12-04T10:32:19.3750332Z * [new branch] gh/PaliC/29/orig -> origin/gh/PaliC/29/orig 2025-12-04T10:32:19.3750508Z * [new branch] gh/PaliC/30/head -> origin/gh/PaliC/30/head 2025-12-04T10:32:19.3750717Z * [new branch] gh/PaliC/30/next -> origin/gh/PaliC/30/next 2025-12-04T10:32:19.3750896Z * [new branch] gh/PaliC/30/orig -> origin/gh/PaliC/30/orig 2025-12-04T10:32:19.3751071Z * [new branch] gh/PaliC/31/head -> origin/gh/PaliC/31/head 2025-12-04T10:32:19.3751243Z * [new branch] gh/PaliC/31/next -> origin/gh/PaliC/31/next 2025-12-04T10:32:19.3751421Z * [new branch] gh/PaliC/31/orig -> origin/gh/PaliC/31/orig 2025-12-04T10:32:19.3751612Z * [new branch] gh/PaulZhang12/25/base -> origin/gh/PaulZhang12/25/base 2025-12-04T10:32:19.3751813Z * [new branch] gh/PaulZhang12/25/head -> origin/gh/PaulZhang12/25/head 2025-12-04T10:32:19.3752009Z * [new branch] gh/PaulZhang12/25/orig -> 
origin/gh/PaulZhang12/25/orig 2025-12-04T10:32:19.3752208Z * [new branch] gh/PaulZhang12/28/base -> origin/gh/PaulZhang12/28/base 2025-12-04T10:32:19.3752405Z * [new branch] gh/PaulZhang12/28/head -> origin/gh/PaulZhang12/28/head 2025-12-04T10:32:19.3752605Z * [new branch] gh/PaulZhang12/28/orig -> origin/gh/PaulZhang12/28/orig 2025-12-04T10:32:19.3752800Z * [new branch] gh/PaulZhang12/31/base -> origin/gh/PaulZhang12/31/base 2025-12-04T10:32:19.3752990Z * [new branch] gh/PaulZhang12/31/head -> origin/gh/PaulZhang12/31/head 2025-12-04T10:32:19.3753186Z * [new branch] gh/PaulZhang12/31/orig -> origin/gh/PaulZhang12/31/orig 2025-12-04T10:32:19.3753380Z * [new branch] gh/PaulZhang12/37/base -> origin/gh/PaulZhang12/37/base 2025-12-04T10:32:19.3753575Z * [new branch] gh/PaulZhang12/37/head -> origin/gh/PaulZhang12/37/head 2025-12-04T10:32:19.3753768Z * [new branch] gh/PaulZhang12/37/orig -> origin/gh/PaulZhang12/37/orig 2025-12-04T10:32:19.3753963Z * [new branch] gh/PaulZhang12/40/base -> origin/gh/PaulZhang12/40/base 2025-12-04T10:32:19.3754157Z * [new branch] gh/PaulZhang12/40/head -> origin/gh/PaulZhang12/40/head 2025-12-04T10:32:19.3754350Z * [new branch] gh/PaulZhang12/40/orig -> origin/gh/PaulZhang12/40/orig 2025-12-04T10:32:19.3754548Z * [new branch] gh/PaulZhang12/42/base -> origin/gh/PaulZhang12/42/base 2025-12-04T10:32:19.3754737Z * [new branch] gh/PaulZhang12/42/head -> origin/gh/PaulZhang12/42/head 2025-12-04T10:32:19.3754930Z * [new branch] gh/PaulZhang12/43/base -> origin/gh/PaulZhang12/43/base 2025-12-04T10:32:19.3755153Z * [new branch] gh/PaulZhang12/43/head -> origin/gh/PaulZhang12/43/head 2025-12-04T10:32:19.3755347Z * [new branch] gh/PaulZhang12/43/orig -> origin/gh/PaulZhang12/43/orig 2025-12-04T10:32:19.3755539Z * [new branch] gh/PaulZhang12/44/base -> origin/gh/PaulZhang12/44/base 2025-12-04T10:32:19.3755730Z * [new branch] gh/PaulZhang12/44/head -> origin/gh/PaulZhang12/44/head 2025-12-04T10:32:19.3755926Z * [new branch] gh/PaulZhang12/45/base -> origin/gh/PaulZhang12/45/base 2025-12-04T10:32:19.3756122Z * [new branch] gh/PaulZhang12/45/head -> origin/gh/PaulZhang12/45/head 2025-12-04T10:32:19.3756314Z * [new branch] gh/PaulZhang12/45/orig -> origin/gh/PaulZhang12/45/orig 2025-12-04T10:32:19.3756569Z * [new branch] gh/PaulZhang12/46/base -> origin/gh/PaulZhang12/46/base 2025-12-04T10:32:19.3756759Z * [new branch] gh/PaulZhang12/46/head -> origin/gh/PaulZhang12/46/head 2025-12-04T10:32:19.3756950Z * [new branch] gh/PaulZhang12/46/orig -> origin/gh/PaulZhang12/46/orig 2025-12-04T10:32:19.3757140Z * [new branch] gh/PaulZhang12/47/base -> origin/gh/PaulZhang12/47/base 2025-12-04T10:32:19.3757331Z * [new branch] gh/PaulZhang12/47/head -> origin/gh/PaulZhang12/47/head 2025-12-04T10:32:19.3757558Z * [new branch] gh/PaulZhang12/47/orig -> origin/gh/PaulZhang12/47/orig 2025-12-04T10:32:19.3757751Z * [new branch] gh/PaulZhang12/48/base -> origin/gh/PaulZhang12/48/base 2025-12-04T10:32:19.3757941Z * [new branch] gh/PaulZhang12/48/head -> origin/gh/PaulZhang12/48/head 2025-12-04T10:32:19.3758127Z * [new branch] gh/PaulZhang12/48/orig -> origin/gh/PaulZhang12/48/orig 2025-12-04T10:32:19.3758322Z * [new branch] gh/SamGinzburg/11/base -> origin/gh/SamGinzburg/11/base 2025-12-04T10:32:19.3758513Z * [new branch] gh/SamGinzburg/11/head -> origin/gh/SamGinzburg/11/head 2025-12-04T10:32:19.3758710Z * [new branch] gh/SherlockNoMad/1/base -> origin/gh/SherlockNoMad/1/base 2025-12-04T10:32:19.3758909Z * [new branch] gh/SherlockNoMad/1/head -> origin/gh/SherlockNoMad/1/head 
2025-12-04T10:32:19.3759107Z * [new branch] gh/SherlockNoMad/10/base -> origin/gh/SherlockNoMad/10/base 2025-12-04T10:32:19.3759309Z * [new branch] gh/SherlockNoMad/10/head -> origin/gh/SherlockNoMad/10/head 2025-12-04T10:32:19.3759510Z * [new branch] gh/SherlockNoMad/10/orig -> origin/gh/SherlockNoMad/10/orig 2025-12-04T10:32:19.3759752Z * [new branch] gh/SherlockNoMad/11/base -> origin/gh/SherlockNoMad/11/base 2025-12-04T10:32:19.3759946Z * [new branch] gh/SherlockNoMad/11/head -> origin/gh/SherlockNoMad/11/head 2025-12-04T10:32:19.3760179Z * [new branch] gh/SherlockNoMad/11/orig -> origin/gh/SherlockNoMad/11/orig 2025-12-04T10:32:19.3760405Z * [new branch] gh/SherlockNoMad/12/base -> origin/gh/SherlockNoMad/12/base 2025-12-04T10:32:19.3760760Z * [new branch] gh/SherlockNoMad/12/head -> origin/gh/SherlockNoMad/12/head 2025-12-04T10:32:19.3760998Z * [new branch] gh/SherlockNoMad/12/orig -> origin/gh/SherlockNoMad/12/orig 2025-12-04T10:32:19.3761225Z * [new branch] gh/SherlockNoMad/15/base -> origin/gh/SherlockNoMad/15/base 2025-12-04T10:32:19.3761482Z * [new branch] gh/SherlockNoMad/15/head -> origin/gh/SherlockNoMad/15/head 2025-12-04T10:32:19.3761723Z * [new branch] gh/SherlockNoMad/15/orig -> origin/gh/SherlockNoMad/15/orig 2025-12-04T10:32:19.3761948Z * [new branch] gh/SherlockNoMad/17/base -> origin/gh/SherlockNoMad/17/base 2025-12-04T10:32:19.3762202Z * [new branch] gh/SherlockNoMad/17/head -> origin/gh/SherlockNoMad/17/head 2025-12-04T10:32:19.3762445Z * [new branch] gh/SherlockNoMad/17/orig -> origin/gh/SherlockNoMad/17/orig 2025-12-04T10:32:19.3762713Z * [new branch] gh/SherlockNoMad/18/base -> origin/gh/SherlockNoMad/18/base 2025-12-04T10:32:19.3762959Z * [new branch] gh/SherlockNoMad/18/head -> origin/gh/SherlockNoMad/18/head 2025-12-04T10:32:19.3763313Z * [new branch] gh/SherlockNoMad/18/orig -> origin/gh/SherlockNoMad/18/orig 2025-12-04T10:32:19.3763546Z * [new branch] gh/SherlockNoMad/19/base -> origin/gh/SherlockNoMad/19/base 2025-12-04T10:32:19.3763807Z * [new branch] gh/SherlockNoMad/19/head -> origin/gh/SherlockNoMad/19/head 2025-12-04T10:32:19.3764093Z * [new branch] gh/SherlockNoMad/19/orig -> origin/gh/SherlockNoMad/19/orig 2025-12-04T10:32:19.3764374Z * [new branch] gh/SherlockNoMad/2/base -> origin/gh/SherlockNoMad/2/base 2025-12-04T10:32:19.3764618Z * [new branch] gh/SherlockNoMad/2/head -> origin/gh/SherlockNoMad/2/head 2025-12-04T10:32:19.3764982Z * [new branch] gh/SherlockNoMad/20/base -> origin/gh/SherlockNoMad/20/base 2025-12-04T10:32:19.3765209Z * [new branch] gh/SherlockNoMad/20/head -> origin/gh/SherlockNoMad/20/head 2025-12-04T10:32:19.3765476Z * [new branch] gh/SherlockNoMad/20/orig -> origin/gh/SherlockNoMad/20/orig 2025-12-04T10:32:19.3765789Z * [new branch] gh/SherlockNoMad/21/base -> origin/gh/SherlockNoMad/21/base 2025-12-04T10:32:19.3766017Z * [new branch] gh/SherlockNoMad/21/head -> origin/gh/SherlockNoMad/21/head 2025-12-04T10:32:19.3791010Z * [new branch] gh/SherlockNoMad/21/orig -> origin/gh/SherlockNoMad/21/orig 2025-12-04T10:32:19.3791280Z * [new branch] gh/SherlockNoMad/3/base -> origin/gh/SherlockNoMad/3/base 2025-12-04T10:32:19.3791497Z * [new branch] gh/SherlockNoMad/3/head -> origin/gh/SherlockNoMad/3/head 2025-12-04T10:32:19.3791706Z * [new branch] gh/SherlockNoMad/4/base -> origin/gh/SherlockNoMad/4/base 2025-12-04T10:32:19.3791933Z * [new branch] gh/SherlockNoMad/4/head -> origin/gh/SherlockNoMad/4/head 2025-12-04T10:32:19.3792132Z * [new branch] gh/SherlockNoMad/5/base -> origin/gh/SherlockNoMad/5/base 2025-12-04T10:32:19.3792354Z * 
[new branch] gh/SherlockNoMad/5/head -> origin/gh/SherlockNoMad/5/head 2025-12-04T10:32:19.3792622Z * [new branch] gh/Sidharth123-cpu/24/base -> origin/gh/Sidharth123-cpu/24/base 2025-12-04T10:32:19.3792861Z * [new branch] gh/Sidharth123-cpu/25/base -> origin/gh/Sidharth123-cpu/25/base 2025-12-04T10:32:19.3793123Z * [new branch] gh/Sidharth123-cpu/26/base -> origin/gh/Sidharth123-cpu/26/base 2025-12-04T10:32:19.3793347Z * [new branch] gh/Sidharth123-cpu/27/base -> origin/gh/Sidharth123-cpu/27/base 2025-12-04T10:32:19.3793561Z * [new branch] gh/StrongerXi/1/base -> origin/gh/StrongerXi/1/base 2025-12-04T10:32:19.3793765Z * [new branch] gh/StrongerXi/1/head -> origin/gh/StrongerXi/1/head 2025-12-04T10:32:19.3794019Z * [new branch] gh/StrongerXi/71/base -> origin/gh/StrongerXi/71/base 2025-12-04T10:32:19.3794225Z * [new branch] gh/StrongerXi/71/head -> origin/gh/StrongerXi/71/head 2025-12-04T10:32:19.3794437Z * [new branch] gh/StrongerXi/72/base -> origin/gh/StrongerXi/72/base 2025-12-04T10:32:19.3794656Z * [new branch] gh/StrongerXi/72/head -> origin/gh/StrongerXi/72/head 2025-12-04T10:32:19.3794844Z * [new branch] gh/StrongerXi/73/base -> origin/gh/StrongerXi/73/base 2025-12-04T10:32:19.3795049Z * [new branch] gh/StrongerXi/73/head -> origin/gh/StrongerXi/73/head 2025-12-04T10:32:19.3795243Z * [new branch] gh/StrongerXi/73/orig -> origin/gh/StrongerXi/73/orig 2025-12-04T10:32:19.3795429Z * [new branch] gh/XilunWu/160/base -> origin/gh/XilunWu/160/base 2025-12-04T10:32:19.3795713Z * [new branch] gh/XilunWu/160/head -> origin/gh/XilunWu/160/head 2025-12-04T10:32:19.3795900Z * [new branch] gh/XilunWu/160/orig -> origin/gh/XilunWu/160/orig 2025-12-04T10:32:19.3796106Z * [new branch] gh/XilunWu/163/base -> origin/gh/XilunWu/163/base 2025-12-04T10:32:19.3796301Z * [new branch] gh/XilunWu/163/head -> origin/gh/XilunWu/163/head 2025-12-04T10:32:19.3796491Z * [new branch] gh/XilunWu/163/orig -> origin/gh/XilunWu/163/orig 2025-12-04T10:32:19.3796675Z * [new branch] gh/XilunWu/168/base -> origin/gh/XilunWu/168/base 2025-12-04T10:32:19.3796866Z * [new branch] gh/XilunWu/168/head -> origin/gh/XilunWu/168/head 2025-12-04T10:32:19.3797057Z * [new branch] gh/XilunWu/168/orig -> origin/gh/XilunWu/168/orig 2025-12-04T10:32:19.3797246Z * [new branch] gh/XilunWu/169/base -> origin/gh/XilunWu/169/base 2025-12-04T10:32:19.3797441Z * [new branch] gh/XilunWu/169/head -> origin/gh/XilunWu/169/head 2025-12-04T10:32:19.3797632Z * [new branch] gh/XilunWu/169/orig -> origin/gh/XilunWu/169/orig 2025-12-04T10:32:19.3797816Z * [new branch] gh/XilunWu/170/base -> origin/gh/XilunWu/170/base 2025-12-04T10:32:19.3798004Z * [new branch] gh/XilunWu/170/head -> origin/gh/XilunWu/170/head 2025-12-04T10:32:19.3798216Z * [new branch] gh/XilunWu/170/orig -> origin/gh/XilunWu/170/orig 2025-12-04T10:32:19.3798405Z * [new branch] gh/XilunWu/171/base -> origin/gh/XilunWu/171/base 2025-12-04T10:32:19.3798586Z * [new branch] gh/XilunWu/171/head -> origin/gh/XilunWu/171/head 2025-12-04T10:32:19.3798764Z * [new branch] gh/XilunWu/171/orig -> origin/gh/XilunWu/171/orig 2025-12-04T10:32:19.3798941Z * [new branch] gh/XilunWu/173/base -> origin/gh/XilunWu/173/base 2025-12-04T10:32:19.3799118Z * [new branch] gh/XilunWu/173/head -> origin/gh/XilunWu/173/head 2025-12-04T10:32:19.3799297Z * [new branch] gh/XilunWu/173/orig -> origin/gh/XilunWu/173/orig 2025-12-04T10:32:19.3799478Z * [new branch] gh/XilunWu/175/base -> origin/gh/XilunWu/175/base 2025-12-04T10:32:19.3799701Z * [new branch] gh/XilunWu/175/head -> origin/gh/XilunWu/175/head 
2025-12-04T10:32:19.3799887Z * [new branch] gh/XilunWu/175/orig -> origin/gh/XilunWu/175/orig 2025-12-04T10:32:19.3800066Z * [new branch] gh/XilunWu/176/base -> origin/gh/XilunWu/176/base 2025-12-04T10:32:19.3800244Z * [new branch] gh/XilunWu/176/head -> origin/gh/XilunWu/176/head 2025-12-04T10:32:19.3800423Z * [new branch] gh/XilunWu/176/orig -> origin/gh/XilunWu/176/orig 2025-12-04T10:32:19.3800609Z * [new branch] gh/XuehaiPan/14/base -> origin/gh/XuehaiPan/14/base 2025-12-04T10:32:19.3800796Z * [new branch] gh/XuehaiPan/14/head -> origin/gh/XuehaiPan/14/head 2025-12-04T10:32:19.3800982Z * [new branch] gh/XuehaiPan/14/orig -> origin/gh/XuehaiPan/14/orig 2025-12-04T10:32:19.3801172Z * [new branch] gh/XuehaiPan/179/base -> origin/gh/XuehaiPan/179/base 2025-12-04T10:32:19.3801363Z * [new branch] gh/XuehaiPan/179/head -> origin/gh/XuehaiPan/179/head 2025-12-04T10:32:19.3801550Z * [new branch] gh/XuehaiPan/179/orig -> origin/gh/XuehaiPan/179/orig 2025-12-04T10:32:19.3801736Z * [new branch] gh/XuehaiPan/249/base -> origin/gh/XuehaiPan/249/base 2025-12-04T10:32:19.3801922Z * [new branch] gh/XuehaiPan/249/head -> origin/gh/XuehaiPan/249/head 2025-12-04T10:32:19.3802108Z * [new branch] gh/XuehaiPan/249/orig -> origin/gh/XuehaiPan/249/orig 2025-12-04T10:32:19.3802297Z * [new branch] gh/XuehaiPan/253/base -> origin/gh/XuehaiPan/253/base 2025-12-04T10:32:19.3802537Z * [new branch] gh/XuehaiPan/253/head -> origin/gh/XuehaiPan/253/head 2025-12-04T10:32:19.3802728Z * [new branch] gh/XuehaiPan/253/orig -> origin/gh/XuehaiPan/253/orig 2025-12-04T10:32:19.3802914Z * [new branch] gh/XuehaiPan/254/base -> origin/gh/XuehaiPan/254/base 2025-12-04T10:32:19.3803102Z * [new branch] gh/XuehaiPan/254/head -> origin/gh/XuehaiPan/254/head 2025-12-04T10:32:19.3803291Z * [new branch] gh/XuehaiPan/254/orig -> origin/gh/XuehaiPan/254/orig 2025-12-04T10:32:19.3803476Z * [new branch] gh/XuehaiPan/255/base -> origin/gh/XuehaiPan/255/base 2025-12-04T10:32:19.3803663Z * [new branch] gh/XuehaiPan/255/head -> origin/gh/XuehaiPan/255/head 2025-12-04T10:32:19.3803849Z * [new branch] gh/XuehaiPan/255/orig -> origin/gh/XuehaiPan/255/orig 2025-12-04T10:32:19.3804038Z * [new branch] gh/XuehaiPan/271/base -> origin/gh/XuehaiPan/271/base 2025-12-04T10:32:19.3804227Z * [new branch] gh/XuehaiPan/271/head -> origin/gh/XuehaiPan/271/head 2025-12-04T10:32:19.3804412Z * [new branch] gh/XuehaiPan/271/orig -> origin/gh/XuehaiPan/271/orig 2025-12-04T10:32:19.3804599Z * [new branch] gh/XuehaiPan/343/base -> origin/gh/XuehaiPan/343/base 2025-12-04T10:32:19.3804818Z * [new branch] gh/XuehaiPan/343/head -> origin/gh/XuehaiPan/343/head 2025-12-04T10:32:19.3805005Z * [new branch] gh/XuehaiPan/343/orig -> origin/gh/XuehaiPan/343/orig 2025-12-04T10:32:19.3805188Z * [new branch] gh/XuehaiPan/347/base -> origin/gh/XuehaiPan/347/base 2025-12-04T10:32:19.3805375Z * [new branch] gh/XuehaiPan/347/head -> origin/gh/XuehaiPan/347/head 2025-12-04T10:32:19.3805561Z * [new branch] gh/XuehaiPan/347/orig -> origin/gh/XuehaiPan/347/orig 2025-12-04T10:32:19.3805746Z * [new branch] gh/XuehaiPan/348/base -> origin/gh/XuehaiPan/348/base 2025-12-04T10:32:19.3805939Z * [new branch] gh/XuehaiPan/348/head -> origin/gh/XuehaiPan/348/head 2025-12-04T10:32:19.3806126Z * [new branch] gh/XuehaiPan/348/orig -> origin/gh/XuehaiPan/348/orig 2025-12-04T10:32:19.3806311Z * [new branch] gh/XuehaiPan/350/base -> origin/gh/XuehaiPan/350/base 2025-12-04T10:32:19.3806506Z * [new branch] gh/XuehaiPan/350/head -> origin/gh/XuehaiPan/350/head 2025-12-04T10:32:19.3806692Z * [new branch] 
gh/XuehaiPan/350/orig -> origin/gh/XuehaiPan/350/orig 2025-12-04T10:32:19.3806875Z * [new branch] gh/XuehaiPan/365/base -> origin/gh/XuehaiPan/365/base 2025-12-04T10:32:19.3807064Z * [new branch] gh/XuehaiPan/365/head -> origin/gh/XuehaiPan/365/head 2025-12-04T10:32:19.3807252Z * [new branch] gh/XuehaiPan/365/orig -> origin/gh/XuehaiPan/365/orig 2025-12-04T10:32:19.3807444Z * [new branch] gh/XuehaiPan/366/base -> origin/gh/XuehaiPan/366/base 2025-12-04T10:32:19.3807635Z * [new branch] gh/XuehaiPan/366/head -> origin/gh/XuehaiPan/366/head 2025-12-04T10:32:19.3807824Z * [new branch] gh/XuehaiPan/370/base -> origin/gh/XuehaiPan/370/base 2025-12-04T10:32:19.3808015Z * [new branch] gh/XuehaiPan/370/head -> origin/gh/XuehaiPan/370/head 2025-12-04T10:32:19.3808209Z * [new branch] gh/XuehaiPan/370/orig -> origin/gh/XuehaiPan/370/orig 2025-12-04T10:32:19.3808394Z * [new branch] gh/XuehaiPan/390/base -> origin/gh/XuehaiPan/390/base 2025-12-04T10:32:19.3808583Z * [new branch] gh/XuehaiPan/390/head -> origin/gh/XuehaiPan/390/head 2025-12-04T10:32:19.3808770Z * [new branch] gh/XuehaiPan/390/orig -> origin/gh/XuehaiPan/390/orig 2025-12-04T10:32:19.3808952Z * [new branch] gh/XuehaiPan/391/base -> origin/gh/XuehaiPan/391/base 2025-12-04T10:32:19.3809178Z * [new branch] gh/XuehaiPan/391/head -> origin/gh/XuehaiPan/391/head 2025-12-04T10:32:19.3809366Z * [new branch] gh/XuehaiPan/391/orig -> origin/gh/XuehaiPan/391/orig 2025-12-04T10:32:19.3809553Z * [new branch] gh/XuehaiPan/392/base -> origin/gh/XuehaiPan/392/base 2025-12-04T10:32:19.3809788Z * [new branch] gh/XuehaiPan/392/head -> origin/gh/XuehaiPan/392/head 2025-12-04T10:32:19.3809974Z * [new branch] gh/XuehaiPan/392/orig -> origin/gh/XuehaiPan/392/orig 2025-12-04T10:32:19.3810156Z * [new branch] gh/XuehaiPan/394/base -> origin/gh/XuehaiPan/394/base 2025-12-04T10:32:19.3810345Z * [new branch] gh/XuehaiPan/394/head -> origin/gh/XuehaiPan/394/head 2025-12-04T10:32:19.3810530Z * [new branch] gh/XuehaiPan/394/orig -> origin/gh/XuehaiPan/394/orig 2025-12-04T10:32:19.3810714Z * [new branch] gh/XuehaiPan/397/base -> origin/gh/XuehaiPan/397/base 2025-12-04T10:32:19.3810910Z * [new branch] gh/XuehaiPan/397/head -> origin/gh/XuehaiPan/397/head 2025-12-04T10:32:19.3811099Z * [new branch] gh/XuehaiPan/397/orig -> origin/gh/XuehaiPan/397/orig 2025-12-04T10:32:19.3811287Z * [new branch] gh/XuehaiPan/398/base -> origin/gh/XuehaiPan/398/base 2025-12-04T10:32:19.3811532Z * [new branch] gh/XuehaiPan/398/head -> origin/gh/XuehaiPan/398/head 2025-12-04T10:32:19.3811717Z * [new branch] gh/XuehaiPan/398/orig -> origin/gh/XuehaiPan/398/orig 2025-12-04T10:32:19.3811901Z * [new branch] gh/XuehaiPan/399/base -> origin/gh/XuehaiPan/399/base 2025-12-04T10:32:19.3812087Z * [new branch] gh/XuehaiPan/399/head -> origin/gh/XuehaiPan/399/head 2025-12-04T10:32:19.3812270Z * [new branch] gh/XuehaiPan/399/orig -> origin/gh/XuehaiPan/399/orig 2025-12-04T10:32:19.3812460Z * [new branch] gh/XuehaiPan/400/base -> origin/gh/XuehaiPan/400/base 2025-12-04T10:32:19.3812653Z * [new branch] gh/XuehaiPan/400/head -> origin/gh/XuehaiPan/400/head 2025-12-04T10:32:19.3812838Z * [new branch] gh/XuehaiPan/400/orig -> origin/gh/XuehaiPan/400/orig 2025-12-04T10:32:19.3813032Z * [new branch] gh/ZhiweiYan-96/39/base -> origin/gh/ZhiweiYan-96/39/base 2025-12-04T10:32:19.3813228Z * [new branch] gh/ZhiweiYan-96/39/head -> origin/gh/ZhiweiYan-96/39/head 2025-12-04T10:32:19.3813420Z * [new branch] gh/ZhiweiYan-96/39/orig -> origin/gh/ZhiweiYan-96/39/orig 2025-12-04T10:32:19.3813611Z * [new branch] 
gh/ZhiweiYan-96/44/base -> origin/gh/ZhiweiYan-96/44/base 2025-12-04T10:32:19.3813803Z * [new branch] gh/ZhiweiYan-96/44/head -> origin/gh/ZhiweiYan-96/44/head 2025-12-04T10:32:19.3813989Z * [new branch] gh/ZhiweiYan-96/45/base -> origin/gh/ZhiweiYan-96/45/base 2025-12-04T10:32:19.3814175Z * [new branch] gh/ZhiweiYan-96/45/head -> origin/gh/ZhiweiYan-96/45/head 2025-12-04T10:32:19.3814364Z * [new branch] gh/ZhiweiYan-96/49/base -> origin/gh/ZhiweiYan-96/49/base 2025-12-04T10:32:19.3814551Z * [new branch] gh/ZhiweiYan-96/49/head -> origin/gh/ZhiweiYan-96/49/head 2025-12-04T10:32:19.3814743Z * [new branch] gh/ZhiweiYan-96/62/base -> origin/gh/ZhiweiYan-96/62/base 2025-12-04T10:32:19.3814938Z * [new branch] gh/ZhiweiYan-96/62/head -> origin/gh/ZhiweiYan-96/62/head 2025-12-04T10:32:19.3815123Z * [new branch] gh/ZhiweiYan-96/66/base -> origin/gh/ZhiweiYan-96/66/base 2025-12-04T10:32:19.3815309Z * [new branch] gh/ZhiweiYan-96/66/head -> origin/gh/ZhiweiYan-96/66/head 2025-12-04T10:32:19.3815493Z * [new branch] gh/ZhiweiYan-96/67/base -> origin/gh/ZhiweiYan-96/67/base 2025-12-04T10:32:19.3815678Z * [new branch] gh/ZhiweiYan-96/67/head -> origin/gh/ZhiweiYan-96/67/head 2025-12-04T10:32:19.3815898Z * [new branch] gh/ZhiweiYan-96/68/base -> origin/gh/ZhiweiYan-96/68/base 2025-12-04T10:32:19.3816090Z * [new branch] gh/ZhiweiYan-96/68/head -> origin/gh/ZhiweiYan-96/68/head 2025-12-04T10:32:19.3816276Z * [new branch] gh/ZhiweiYan-96/68/orig -> origin/gh/ZhiweiYan-96/68/orig 2025-12-04T10:32:19.3816464Z * [new branch] gh/aakhundov/1/base -> origin/gh/aakhundov/1/base 2025-12-04T10:32:19.3816647Z * [new branch] gh/aakhundov/1/head -> origin/gh/aakhundov/1/head 2025-12-04T10:32:19.3816832Z * [new branch] gh/aakhundov/2/base -> origin/gh/aakhundov/2/base 2025-12-04T10:32:19.3817015Z * [new branch] gh/aakhundov/2/head -> origin/gh/aakhundov/2/head 2025-12-04T10:32:19.3817198Z * [new branch] gh/aditew01/openblas -> origin/gh/aditew01/openblas 2025-12-04T10:32:19.3817386Z * [new branch] gh/aditew01/sbgemm -> origin/gh/aditew01/sbgemm 2025-12-04T10:32:19.3817573Z * [new branch] gh/aditew01/vecbf16 -> origin/gh/aditew01/vecbf16 2025-12-04T10:32:19.3817752Z * [new branch] gh/albanD/4/base -> origin/gh/albanD/4/base 2025-12-04T10:32:19.3817929Z * [new branch] gh/albanD/4/head -> origin/gh/albanD/4/head 2025-12-04T10:32:19.3818103Z * [new branch] gh/albanD/4/orig -> origin/gh/albanD/4/orig 2025-12-04T10:32:19.3818393Z * [new branch] gh/alexbrauckmann/paddedtensor_faketensor_init -> origin/gh/alexbrauckmann/paddedtensor_faketensor_init 2025-12-04T10:32:19.3818669Z * [new branch] gh/alexsamardzic/12/base -> origin/gh/alexsamardzic/12/base 2025-12-04T10:32:19.3818875Z * [new branch] gh/alexsamardzic/12/head -> origin/gh/alexsamardzic/12/head 2025-12-04T10:32:19.3819071Z * [new branch] gh/alexsamardzic/12/orig -> origin/gh/alexsamardzic/12/orig 2025-12-04T10:32:19.3819268Z * [new branch] gh/alexsamardzic/14/base -> origin/gh/alexsamardzic/14/base 2025-12-04T10:32:19.3819469Z * [new branch] gh/alexsamardzic/14/head -> origin/gh/alexsamardzic/14/head 2025-12-04T10:32:19.3819710Z * [new branch] gh/alexsamardzic/14/orig -> origin/gh/alexsamardzic/14/orig 2025-12-04T10:32:19.3819911Z * [new branch] gh/alexsamardzic/15/base -> origin/gh/alexsamardzic/15/base 2025-12-04T10:32:19.3820115Z * [new branch] gh/alexsamardzic/15/head -> origin/gh/alexsamardzic/15/head 2025-12-04T10:32:19.3820311Z * [new branch] gh/alexsamardzic/15/orig -> origin/gh/alexsamardzic/15/orig 2025-12-04T10:32:19.3820502Z * [new branch] gh/amjames/18/base 
-> origin/gh/amjames/18/base 2025-12-04T10:32:19.3820683Z * [new branch] gh/amjames/18/head -> origin/gh/amjames/18/head 2025-12-04T10:32:19.3820860Z * [new branch] gh/amjames/18/orig -> origin/gh/amjames/18/orig 2025-12-04T10:32:19.3821048Z * [new branch] gh/andrewor14/35/base -> origin/gh/andrewor14/35/base 2025-12-04T10:32:19.3821238Z * [new branch] gh/andrewor14/35/head -> origin/gh/andrewor14/35/head 2025-12-04T10:32:19.3821423Z * [new branch] gh/andrewor14/35/orig -> origin/gh/andrewor14/35/orig 2025-12-04T10:32:19.3821610Z * [new branch] gh/andrewor14/50/base -> origin/gh/andrewor14/50/base 2025-12-04T10:32:19.3821792Z * [new branch] gh/andrewor14/50/head -> origin/gh/andrewor14/50/head 2025-12-04T10:32:19.3821981Z * [new branch] gh/andrewor14/50/orig -> origin/gh/andrewor14/50/orig 2025-12-04T10:32:19.3822167Z * [new branch] gh/andyanwang/30/base -> origin/gh/andyanwang/30/base 2025-12-04T10:32:19.3822350Z * [new branch] gh/andyanwang/30/orig -> origin/gh/andyanwang/30/orig 2025-12-04T10:32:19.3822539Z * [new branch] gh/andyanwang/31/base -> origin/gh/andyanwang/31/base 2025-12-04T10:32:19.3822766Z * [new branch] gh/andyanwang/31/orig -> origin/gh/andyanwang/31/orig 2025-12-04T10:32:19.3822951Z * [new branch] gh/andyanwang/39/base -> origin/gh/andyanwang/39/base 2025-12-04T10:32:19.3823140Z * [new branch] gh/andyanwang/39/head -> origin/gh/andyanwang/39/head 2025-12-04T10:32:19.3823327Z * [new branch] gh/andyanwang/39/orig -> origin/gh/andyanwang/39/orig 2025-12-04T10:32:19.3823514Z * [new branch] gh/andyanwang/42/base -> origin/gh/andyanwang/42/base 2025-12-04T10:32:19.3823701Z * [new branch] gh/andyanwang/42/head -> origin/gh/andyanwang/42/head 2025-12-04T10:32:19.3823889Z * [new branch] gh/andyanwang/42/orig -> origin/gh/andyanwang/42/orig 2025-12-04T10:32:19.3824075Z * [new branch] gh/andyanwang/45/base -> origin/gh/andyanwang/45/base 2025-12-04T10:32:19.3824264Z * [new branch] gh/andyanwang/45/head -> origin/gh/andyanwang/45/head 2025-12-04T10:32:19.3824455Z * [new branch] gh/andyanwang/45/orig -> origin/gh/andyanwang/45/orig 2025-12-04T10:32:19.3824638Z * [new branch] gh/angelayi/107/base -> origin/gh/angelayi/107/base 2025-12-04T10:32:19.3824828Z * [new branch] gh/angelayi/107/head -> origin/gh/angelayi/107/head 2025-12-04T10:32:19.3825053Z * [new branch] gh/angelayi/114/base -> origin/gh/angelayi/114/base 2025-12-04T10:32:19.3825241Z * [new branch] gh/angelayi/114/head -> origin/gh/angelayi/114/head 2025-12-04T10:32:19.3825423Z * [new branch] gh/angelayi/114/orig -> origin/gh/angelayi/114/orig 2025-12-04T10:32:19.3825605Z * [new branch] gh/angelayi/116/base -> origin/gh/angelayi/116/base 2025-12-04T10:32:19.3825783Z * [new branch] gh/angelayi/116/head -> origin/gh/angelayi/116/head 2025-12-04T10:32:19.3825964Z * [new branch] gh/angelayi/116/orig -> origin/gh/angelayi/116/orig 2025-12-04T10:32:19.3826150Z * [new branch] gh/angelayi/122/base -> origin/gh/angelayi/122/base 2025-12-04T10:32:19.3826330Z * [new branch] gh/angelayi/122/head -> origin/gh/angelayi/122/head 2025-12-04T10:32:19.3826513Z * [new branch] gh/angelayi/122/orig -> origin/gh/angelayi/122/orig 2025-12-04T10:32:19.3826695Z * [new branch] gh/angelayi/124/base -> origin/gh/angelayi/124/base 2025-12-04T10:32:19.3826874Z * [new branch] gh/angelayi/124/head -> origin/gh/angelayi/124/head 2025-12-04T10:32:19.3827057Z * [new branch] gh/angelayi/124/orig -> origin/gh/angelayi/124/orig 2025-12-04T10:32:19.3827239Z * [new branch] gh/angelayi/128/base -> origin/gh/angelayi/128/base 2025-12-04T10:32:19.3827418Z * [new 
branch] gh/angelayi/128/head -> origin/gh/angelayi/128/head 2025-12-04T10:32:19.3827600Z * [new branch] gh/angelayi/128/orig -> origin/gh/angelayi/128/orig 2025-12-04T10:32:19.3827784Z * [new branch] gh/angelayi/131/base -> origin/gh/angelayi/131/base 2025-12-04T10:32:19.3827962Z * [new branch] gh/angelayi/131/head -> origin/gh/angelayi/131/head 2025-12-04T10:32:19.3828143Z * [new branch] gh/angelayi/131/orig -> origin/gh/angelayi/131/orig 2025-12-04T10:32:19.3828327Z * [new branch] gh/angelayi/132/base -> origin/gh/angelayi/132/base 2025-12-04T10:32:19.3828505Z * [new branch] gh/angelayi/132/head -> origin/gh/angelayi/132/head 2025-12-04T10:32:19.3828688Z * [new branch] gh/angelayi/132/orig -> origin/gh/angelayi/132/orig 2025-12-04T10:32:19.3828869Z * [new branch] gh/angelayi/133/base -> origin/gh/angelayi/133/base 2025-12-04T10:32:19.3829048Z * [new branch] gh/angelayi/133/head -> origin/gh/angelayi/133/head 2025-12-04T10:32:19.3829229Z * [new branch] gh/angelayi/133/orig -> origin/gh/angelayi/133/orig 2025-12-04T10:32:19.3829439Z * [new branch] gh/angelayi/134/base -> origin/gh/angelayi/134/base 2025-12-04T10:32:19.3829662Z * [new branch] gh/angelayi/134/head -> origin/gh/angelayi/134/head 2025-12-04T10:32:19.3829846Z * [new branch] gh/angelayi/134/orig -> origin/gh/angelayi/134/orig 2025-12-04T10:32:19.3830028Z * [new branch] gh/angelayi/135/base -> origin/gh/angelayi/135/base 2025-12-04T10:32:19.3830210Z * [new branch] gh/angelayi/135/head -> origin/gh/angelayi/135/head 2025-12-04T10:32:19.3830391Z * [new branch] gh/angelayi/135/orig -> origin/gh/angelayi/135/orig 2025-12-04T10:32:19.3830570Z * [new branch] gh/angelayi/136/base -> origin/gh/angelayi/136/base 2025-12-04T10:32:19.3830750Z * [new branch] gh/angelayi/136/head -> origin/gh/angelayi/136/head 2025-12-04T10:32:19.3830931Z * [new branch] gh/angelayi/136/orig -> origin/gh/angelayi/136/orig 2025-12-04T10:32:19.3831113Z * [new branch] gh/angelayi/137/base -> origin/gh/angelayi/137/base 2025-12-04T10:32:19.3831293Z * [new branch] gh/angelayi/137/head -> origin/gh/angelayi/137/head 2025-12-04T10:32:19.3831473Z * [new branch] gh/angelayi/137/orig -> origin/gh/angelayi/137/orig 2025-12-04T10:32:19.3831687Z * [new branch] gh/angelayi/138/base -> origin/gh/angelayi/138/base 2025-12-04T10:32:19.3831872Z * [new branch] gh/angelayi/138/head -> origin/gh/angelayi/138/head 2025-12-04T10:32:19.3832055Z * [new branch] gh/angelayi/138/orig -> origin/gh/angelayi/138/orig 2025-12-04T10:32:19.3832235Z * [new branch] gh/angelayi/139/base -> origin/gh/angelayi/139/base 2025-12-04T10:32:19.3832415Z * [new branch] gh/angelayi/139/head -> origin/gh/angelayi/139/head 2025-12-04T10:32:19.3832595Z * [new branch] gh/angelayi/139/orig -> origin/gh/angelayi/139/orig 2025-12-04T10:32:19.3832777Z * [new branch] gh/angelayi/140/base -> origin/gh/angelayi/140/base 2025-12-04T10:32:19.3832960Z * [new branch] gh/angelayi/140/head -> origin/gh/angelayi/140/head 2025-12-04T10:32:19.3833139Z * [new branch] gh/angelayi/140/orig -> origin/gh/angelayi/140/orig 2025-12-04T10:32:19.3833327Z * [new branch] gh/angelayi/141/base -> origin/gh/angelayi/141/base 2025-12-04T10:32:19.3833506Z * [new branch] gh/angelayi/141/head -> origin/gh/angelayi/141/head 2025-12-04T10:32:19.3833684Z * [new branch] gh/angelayi/141/orig -> origin/gh/angelayi/141/orig 2025-12-04T10:32:19.3833870Z * [new branch] gh/angelayi/142/base -> origin/gh/angelayi/142/base 2025-12-04T10:32:19.3834051Z * [new branch] gh/angelayi/142/head -> origin/gh/angelayi/142/head 2025-12-04T10:32:19.3834231Z * [new 
branch] gh/angelayi/142/orig -> origin/gh/angelayi/142/orig 2025-12-04T10:32:19.3834413Z * [new branch] gh/angelayi/143/base -> origin/gh/angelayi/143/base 2025-12-04T10:32:19.3834675Z * [new branch] gh/angelayi/143/head -> origin/gh/angelayi/143/head 2025-12-04T10:32:19.3834857Z * [new branch] gh/angelayi/143/orig -> origin/gh/angelayi/143/orig 2025-12-04T10:32:19.3835040Z * [new branch] gh/angelayi/144/base -> origin/gh/angelayi/144/base 2025-12-04T10:32:19.3835222Z * [new branch] gh/angelayi/144/head -> origin/gh/angelayi/144/head 2025-12-04T10:32:19.3835402Z * [new branch] gh/angelayi/144/orig -> origin/gh/angelayi/144/orig 2025-12-04T10:32:19.3835590Z * [new branch] gh/anijain2305/753/base -> origin/gh/anijain2305/753/base 2025-12-04T10:32:19.3835780Z * [new branch] gh/anijain2305/753/head -> origin/gh/anijain2305/753/head 2025-12-04T10:32:19.3835998Z * [new branch] gh/anijain2305/753/orig -> origin/gh/anijain2305/753/orig 2025-12-04T10:32:19.3836190Z * [new branch] gh/anijain2305/810/base -> origin/gh/anijain2305/810/base 2025-12-04T10:32:19.3836380Z * [new branch] gh/anijain2305/810/head -> origin/gh/anijain2305/810/head 2025-12-04T10:32:19.3836564Z * [new branch] gh/anijain2305/810/orig -> origin/gh/anijain2305/810/orig 2025-12-04T10:32:19.3836750Z * [new branch] gh/anijain2305/854/base -> origin/gh/anijain2305/854/base 2025-12-04T10:32:19.3836936Z * [new branch] gh/anijain2305/854/head -> origin/gh/anijain2305/854/head 2025-12-04T10:32:19.3837122Z * [new branch] gh/anijain2305/854/orig -> origin/gh/anijain2305/854/orig 2025-12-04T10:32:19.3837311Z * [new branch] gh/anijain2305/864/base -> origin/gh/anijain2305/864/base 2025-12-04T10:32:19.3837496Z * [new branch] gh/anijain2305/864/head -> origin/gh/anijain2305/864/head 2025-12-04T10:32:19.3837683Z * [new branch] gh/anijain2305/864/orig -> origin/gh/anijain2305/864/orig 2025-12-04T10:32:19.3837875Z * [new branch] gh/anijain2305/870/base -> origin/gh/anijain2305/870/base 2025-12-04T10:32:19.3838061Z * [new branch] gh/anijain2305/870/head -> origin/gh/anijain2305/870/head 2025-12-04T10:32:19.3838285Z * [new branch] gh/anijain2305/870/orig -> origin/gh/anijain2305/870/orig 2025-12-04T10:32:19.3838472Z * [new branch] gh/anijain2305/873/base -> origin/gh/anijain2305/873/base 2025-12-04T10:32:19.3838655Z * [new branch] gh/anijain2305/873/head -> origin/gh/anijain2305/873/head 2025-12-04T10:32:19.3838842Z * [new branch] gh/anijain2305/873/orig -> origin/gh/anijain2305/873/orig 2025-12-04T10:32:19.3839028Z * [new branch] gh/anijain2305/894/base -> origin/gh/anijain2305/894/base 2025-12-04T10:32:19.3839213Z * [new branch] gh/anijain2305/894/head -> origin/gh/anijain2305/894/head 2025-12-04T10:32:19.3839400Z * [new branch] gh/anijain2305/894/orig -> origin/gh/anijain2305/894/orig 2025-12-04T10:32:19.3839625Z * [new branch] gh/anijain2305/895/base -> origin/gh/anijain2305/895/base 2025-12-04T10:32:19.3839813Z * [new branch] gh/anijain2305/895/head -> origin/gh/anijain2305/895/head 2025-12-04T10:32:19.3840003Z * [new branch] gh/anijain2305/895/orig -> origin/gh/anijain2305/895/orig 2025-12-04T10:32:19.3840193Z * [new branch] gh/anijain2305/910/base -> origin/gh/anijain2305/910/base 2025-12-04T10:32:19.3840377Z * [new branch] gh/anijain2305/910/head -> origin/gh/anijain2305/910/head 2025-12-04T10:32:19.3840565Z * [new branch] gh/anijain2305/910/orig -> origin/gh/anijain2305/910/orig 2025-12-04T10:32:19.3840751Z * [new branch] gh/anijain2305/919/base -> origin/gh/anijain2305/919/base 2025-12-04T10:32:19.3840937Z * [new branch] 
gh/anijain2305/919/head -> origin/gh/anijain2305/919/head 2025-12-04T10:32:19.3841124Z * [new branch] gh/anijain2305/919/orig -> origin/gh/anijain2305/919/orig 2025-12-04T10:32:19.3841310Z * [new branch] gh/anijain2305/922/base -> origin/gh/anijain2305/922/base 2025-12-04T10:32:19.3841492Z * [new branch] gh/anijain2305/922/head -> origin/gh/anijain2305/922/head 2025-12-04T10:32:19.3841680Z * [new branch] gh/anijain2305/922/orig -> origin/gh/anijain2305/922/orig 2025-12-04T10:32:19.3841866Z * [new branch] gh/anijain2305/932/base -> origin/gh/anijain2305/932/base 2025-12-04T10:32:19.3842052Z * [new branch] gh/anijain2305/932/head -> origin/gh/anijain2305/932/head 2025-12-04T10:32:19.3842237Z * [new branch] gh/anijain2305/932/orig -> origin/gh/anijain2305/932/orig 2025-12-04T10:32:19.3842420Z * [new branch] gh/anijain2305/940/base -> origin/gh/anijain2305/940/base 2025-12-04T10:32:19.3842645Z * [new branch] gh/anijain2305/940/head -> origin/gh/anijain2305/940/head 2025-12-04T10:32:19.3842832Z * [new branch] gh/anijain2305/940/orig -> origin/gh/anijain2305/940/orig 2025-12-04T10:32:19.3843016Z * [new branch] gh/anijain2305/941/base -> origin/gh/anijain2305/941/base 2025-12-04T10:32:19.3843204Z * [new branch] gh/anijain2305/941/head -> origin/gh/anijain2305/941/head 2025-12-04T10:32:19.3843395Z * [new branch] gh/anijain2305/941/orig -> origin/gh/anijain2305/941/orig 2025-12-04T10:32:19.3843579Z * [new branch] gh/anijain2305/942/base -> origin/gh/anijain2305/942/base 2025-12-04T10:32:19.3843766Z * [new branch] gh/anijain2305/942/head -> origin/gh/anijain2305/942/head 2025-12-04T10:32:19.3843951Z * [new branch] gh/anijain2305/942/orig -> origin/gh/anijain2305/942/orig 2025-12-04T10:32:19.3844137Z * [new branch] gh/anijain2305/943/base -> origin/gh/anijain2305/943/base 2025-12-04T10:32:19.3844329Z * [new branch] gh/anijain2305/943/head -> origin/gh/anijain2305/943/head 2025-12-04T10:32:19.3844517Z * [new branch] gh/anijain2305/943/orig -> origin/gh/anijain2305/943/orig 2025-12-04T10:32:19.3844700Z * [new branch] gh/anijain2305/944/base -> origin/gh/anijain2305/944/base 2025-12-04T10:32:19.3844930Z * [new branch] gh/anijain2305/944/head -> origin/gh/anijain2305/944/head 2025-12-04T10:32:19.3845115Z * [new branch] gh/anijain2305/944/orig -> origin/gh/anijain2305/944/orig 2025-12-04T10:32:19.3845301Z * [new branch] gh/anijain2305/945/base -> origin/gh/anijain2305/945/base 2025-12-04T10:32:19.3845487Z * [new branch] gh/anijain2305/945/head -> origin/gh/anijain2305/945/head 2025-12-04T10:32:19.3845672Z * [new branch] gh/anijain2305/945/orig -> origin/gh/anijain2305/945/orig 2025-12-04T10:32:19.3845861Z * [new branch] gh/anijain2305/946/base -> origin/gh/anijain2305/946/base 2025-12-04T10:32:19.3846046Z * [new branch] gh/anijain2305/946/head -> origin/gh/anijain2305/946/head 2025-12-04T10:32:19.3846232Z * [new branch] gh/anijain2305/946/orig -> origin/gh/anijain2305/946/orig 2025-12-04T10:32:19.3846416Z * [new branch] gh/anijain2305/947/base -> origin/gh/anijain2305/947/base 2025-12-04T10:32:19.3846605Z * [new branch] gh/anijain2305/947/head -> origin/gh/anijain2305/947/head 2025-12-04T10:32:19.3846791Z * [new branch] gh/anijain2305/947/orig -> origin/gh/anijain2305/947/orig 2025-12-04T10:32:19.3846974Z * [new branch] gh/anijain2305/948/base -> origin/gh/anijain2305/948/base 2025-12-04T10:32:19.3847158Z * [new branch] gh/anijain2305/948/head -> origin/gh/anijain2305/948/head 2025-12-04T10:32:19.3847343Z * [new branch] gh/anijain2305/948/orig -> origin/gh/anijain2305/948/orig 2025-12-04T10:32:19.3847532Z 
* [new branch] gh/anijain2305/949/base -> origin/gh/anijain2305/949/base 2025-12-04T10:32:19.3847717Z * [new branch] gh/anijain2305/949/head -> origin/gh/anijain2305/949/head 2025-12-04T10:32:19.3847901Z * [new branch] gh/anijain2305/949/orig -> origin/gh/anijain2305/949/orig 2025-12-04T10:32:19.3848090Z * [new branch] gh/anijain2305/950/base -> origin/gh/anijain2305/950/base 2025-12-04T10:32:19.3848280Z * [new branch] gh/anijain2305/950/head -> origin/gh/anijain2305/950/head 2025-12-04T10:32:19.3848462Z * [new branch] gh/anijain2305/950/orig -> origin/gh/anijain2305/950/orig 2025-12-04T10:32:19.3848647Z * [new branch] gh/anijain2305/951/base -> origin/gh/anijain2305/951/base 2025-12-04T10:32:19.3848833Z * [new branch] gh/anijain2305/951/head -> origin/gh/anijain2305/951/head 2025-12-04T10:32:19.3849018Z * [new branch] gh/anijain2305/951/orig -> origin/gh/anijain2305/951/orig 2025-12-04T10:32:19.3849246Z * [new branch] gh/anijain2305/952/base -> origin/gh/anijain2305/952/base 2025-12-04T10:32:19.3849432Z * [new branch] gh/anijain2305/952/head -> origin/gh/anijain2305/952/head 2025-12-04T10:32:19.3849651Z * [new branch] gh/anijain2305/952/orig -> origin/gh/anijain2305/952/orig 2025-12-04T10:32:19.3849840Z * [new branch] gh/anijain2305/953/base -> origin/gh/anijain2305/953/base 2025-12-04T10:32:19.3850027Z * [new branch] gh/anijain2305/953/head -> origin/gh/anijain2305/953/head 2025-12-04T10:32:19.3850211Z * [new branch] gh/anijain2305/953/orig -> origin/gh/anijain2305/953/orig 2025-12-04T10:32:19.3850397Z * [new branch] gh/anijain2305/954/base -> origin/gh/anijain2305/954/base 2025-12-04T10:32:19.3850584Z * [new branch] gh/anijain2305/954/head -> origin/gh/anijain2305/954/head 2025-12-04T10:32:19.3850775Z * [new branch] gh/anijain2305/954/orig -> origin/gh/anijain2305/954/orig 2025-12-04T10:32:19.3850961Z * [new branch] gh/anijain2305/955/base -> origin/gh/anijain2305/955/base 2025-12-04T10:32:19.3851147Z * [new branch] gh/anijain2305/955/head -> origin/gh/anijain2305/955/head 2025-12-04T10:32:19.3851336Z * [new branch] gh/anijain2305/955/orig -> origin/gh/anijain2305/955/orig 2025-12-04T10:32:19.3851572Z * [new branch] gh/anijain2305/956/base -> origin/gh/anijain2305/956/base 2025-12-04T10:32:19.3851757Z * [new branch] gh/anijain2305/956/head -> origin/gh/anijain2305/956/head 2025-12-04T10:32:19.3851943Z * [new branch] gh/anijain2305/956/orig -> origin/gh/anijain2305/956/orig 2025-12-04T10:32:19.3852128Z * [new branch] gh/anijain2305/957/base -> origin/gh/anijain2305/957/base 2025-12-04T10:32:19.3852313Z * [new branch] gh/anijain2305/957/head -> origin/gh/anijain2305/957/head 2025-12-04T10:32:19.3852504Z * [new branch] gh/anijain2305/957/orig -> origin/gh/anijain2305/957/orig 2025-12-04T10:32:19.3852693Z * [new branch] gh/anijain2305/958/base -> origin/gh/anijain2305/958/base 2025-12-04T10:32:19.3852878Z * [new branch] gh/anijain2305/958/head -> origin/gh/anijain2305/958/head 2025-12-04T10:32:19.3853066Z * [new branch] gh/anijain2305/958/orig -> origin/gh/anijain2305/958/orig 2025-12-04T10:32:19.3853252Z * [new branch] gh/anijain2305/959/base -> origin/gh/anijain2305/959/base 2025-12-04T10:32:19.3853439Z * [new branch] gh/anijain2305/959/head -> origin/gh/anijain2305/959/head 2025-12-04T10:32:19.3853629Z * [new branch] gh/anijain2305/959/orig -> origin/gh/anijain2305/959/orig 2025-12-04T10:32:19.3853815Z * [new branch] gh/anijain2305/960/base -> origin/gh/anijain2305/960/base 2025-12-04T10:32:19.3853998Z * [new branch] gh/anijain2305/960/head -> origin/gh/anijain2305/960/head 
2025-12-04T10:32:19.3854185Z * [new branch] gh/anijain2305/960/orig -> origin/gh/anijain2305/960/orig 2025-12-04T10:32:19.3854369Z * [new branch] gh/anijain2305/961/base -> origin/gh/anijain2305/961/base 2025-12-04T10:32:19.3854555Z * [new branch] gh/anijain2305/961/head -> origin/gh/anijain2305/961/head 2025-12-04T10:32:19.3854741Z * [new branch] gh/anijain2305/961/orig -> origin/gh/anijain2305/961/orig 2025-12-04T10:32:19.3854930Z * [new branch] gh/anijain2305/962/base -> origin/gh/anijain2305/962/base 2025-12-04T10:32:19.3855115Z * [new branch] gh/anijain2305/962/head -> origin/gh/anijain2305/962/head 2025-12-04T10:32:19.3855303Z * [new branch] gh/anijain2305/962/orig -> origin/gh/anijain2305/962/orig 2025-12-04T10:32:19.3855492Z * [new branch] gh/anijain2305/963/base -> origin/gh/anijain2305/963/base 2025-12-04T10:32:19.3855732Z * [new branch] gh/anijain2305/963/head -> origin/gh/anijain2305/963/head 2025-12-04T10:32:19.3855919Z * [new branch] gh/anijain2305/963/orig -> origin/gh/anijain2305/963/orig 2025-12-04T10:32:19.3856103Z * [new branch] gh/anijain2305/964/base -> origin/gh/anijain2305/964/base 2025-12-04T10:32:19.3856286Z * [new branch] gh/anijain2305/964/head -> origin/gh/anijain2305/964/head 2025-12-04T10:32:19.3856474Z * [new branch] gh/anijain2305/964/orig -> origin/gh/anijain2305/964/orig 2025-12-04T10:32:19.3856661Z * [new branch] gh/anijain2305/965/base -> origin/gh/anijain2305/965/base 2025-12-04T10:32:19.3856847Z * [new branch] gh/anijain2305/965/head -> origin/gh/anijain2305/965/head 2025-12-04T10:32:19.3857032Z * [new branch] gh/anijain2305/965/orig -> origin/gh/anijain2305/965/orig 2025-12-04T10:32:19.3857216Z * [new branch] gh/anijain2305/966/base -> origin/gh/anijain2305/966/base 2025-12-04T10:32:19.3857407Z * [new branch] gh/anijain2305/966/head -> origin/gh/anijain2305/966/head 2025-12-04T10:32:19.3857594Z * [new branch] gh/anijain2305/966/orig -> origin/gh/anijain2305/966/orig 2025-12-04T10:32:19.3857778Z * [new branch] gh/anijain2305/967/base -> origin/gh/anijain2305/967/base 2025-12-04T10:32:19.3858006Z * [new branch] gh/anijain2305/967/head -> origin/gh/anijain2305/967/head 2025-12-04T10:32:19.3858191Z * [new branch] gh/anijain2305/967/orig -> origin/gh/anijain2305/967/orig 2025-12-04T10:32:19.3858375Z * [new branch] gh/anijain2305/968/base -> origin/gh/anijain2305/968/base 2025-12-04T10:32:19.3858562Z * [new branch] gh/anijain2305/968/head -> origin/gh/anijain2305/968/head 2025-12-04T10:32:19.3858748Z * [new branch] gh/anijain2305/968/orig -> origin/gh/anijain2305/968/orig 2025-12-04T10:32:19.3858935Z * [new branch] gh/anijain2305/969/base -> origin/gh/anijain2305/969/base 2025-12-04T10:32:19.3859124Z * [new branch] gh/anijain2305/969/head -> origin/gh/anijain2305/969/head 2025-12-04T10:32:19.3859311Z * [new branch] gh/anijain2305/969/orig -> origin/gh/anijain2305/969/orig 2025-12-04T10:32:19.3859497Z * [new branch] gh/anijain2305/970/base -> origin/gh/anijain2305/970/base 2025-12-04T10:32:19.3859739Z * [new branch] gh/anijain2305/970/head -> origin/gh/anijain2305/970/head 2025-12-04T10:32:19.3859924Z * [new branch] gh/anijain2305/970/orig -> origin/gh/anijain2305/970/orig 2025-12-04T10:32:19.3860109Z * [new branch] gh/anjali411/216/base -> origin/gh/anjali411/216/base 2025-12-04T10:32:19.3860294Z * [new branch] gh/anjali411/216/head -> origin/gh/anjali411/216/head 2025-12-04T10:32:19.3860474Z * [new branch] gh/anjali411/216/orig -> origin/gh/anjali411/216/orig 2025-12-04T10:32:19.3860659Z * [new branch] gh/anshul-si/1/base -> origin/gh/anshul-si/1/base 
2025-12-04T10:32:19.3860841Z * [new branch] gh/anshul-si/1/head -> origin/gh/anshul-si/1/head 2025-12-04T10:32:19.3861017Z * [new branch] gh/anshul-si/2/base -> origin/gh/anshul-si/2/base 2025-12-04T10:32:19.3861194Z * [new branch] gh/anshul-si/2/head -> origin/gh/anshul-si/2/head 2025-12-04T10:32:19.3861375Z * [new branch] gh/anshul-si/3/base -> origin/gh/anshul-si/3/base 2025-12-04T10:32:19.3861549Z * [new branch] gh/anshul-si/3/head -> origin/gh/anshul-si/3/head 2025-12-04T10:32:19.3861728Z * [new branch] gh/anshul-si/4/base -> origin/gh/anshul-si/4/base 2025-12-04T10:32:19.3861907Z * [new branch] gh/anshul-si/4/head -> origin/gh/anshul-si/4/head 2025-12-04T10:32:19.3862081Z * [new branch] gh/anshul-si/5/base -> origin/gh/anshul-si/5/base 2025-12-04T10:32:19.3862257Z * [new branch] gh/anshul-si/5/head -> origin/gh/anshul-si/5/head 2025-12-04T10:32:19.3862505Z * [new branch] gh/anshul-si/53/base -> origin/gh/anshul-si/53/base 2025-12-04T10:32:19.3862686Z * [new branch] gh/anshul-si/53/head -> origin/gh/anshul-si/53/head 2025-12-04T10:32:19.3862868Z * [new branch] gh/anshul-si/58/base -> origin/gh/anshul-si/58/base 2025-12-04T10:32:19.3863051Z * [new branch] gh/anshul-si/58/head -> origin/gh/anshul-si/58/head 2025-12-04T10:32:19.3863229Z * [new branch] gh/anshul-si/66/base -> origin/gh/anshul-si/66/base 2025-12-04T10:32:19.3863408Z * [new branch] gh/anshul-si/66/head -> origin/gh/anshul-si/66/head 2025-12-04T10:32:19.3863587Z * [new branch] gh/anshul-si/66/orig -> origin/gh/anshul-si/66/orig 2025-12-04T10:32:19.3863763Z * [new branch] gh/anshul-si/67/base -> origin/gh/anshul-si/67/base 2025-12-04T10:32:19.3863949Z * [new branch] gh/anshul-si/67/head -> origin/gh/anshul-si/67/head 2025-12-04T10:32:19.3864134Z * [new branch] gh/anshul-si/67/orig -> origin/gh/anshul-si/67/orig 2025-12-04T10:32:19.3864310Z * [new branch] gh/anshul-si/68/base -> origin/gh/anshul-si/68/base 2025-12-04T10:32:19.3864491Z * [new branch] gh/anshul-si/68/head -> origin/gh/anshul-si/68/head 2025-12-04T10:32:19.3864718Z * [new branch] gh/anshul-si/68/orig -> origin/gh/anshul-si/68/orig 2025-12-04T10:32:19.3864897Z * [new branch] gh/anshul-si/69/base -> origin/gh/anshul-si/69/base 2025-12-04T10:32:19.3865078Z * [new branch] gh/anshul-si/69/head -> origin/gh/anshul-si/69/head 2025-12-04T10:32:19.3865254Z * [new branch] gh/anshul-si/69/orig -> origin/gh/anshul-si/69/orig 2025-12-04T10:32:19.3865433Z * [new branch] gh/anshul-si/70/base -> origin/gh/anshul-si/70/base 2025-12-04T10:32:19.3865616Z * [new branch] gh/anshul-si/70/head -> origin/gh/anshul-si/70/head 2025-12-04T10:32:19.3865794Z * [new branch] gh/anshul-si/70/orig -> origin/gh/anshul-si/70/orig 2025-12-04T10:32:19.3865975Z * [new branch] gh/anshul-si/71/base -> origin/gh/anshul-si/71/base 2025-12-04T10:32:19.3866156Z * [new branch] gh/anshul-si/71/head -> origin/gh/anshul-si/71/head 2025-12-04T10:32:19.3866337Z * [new branch] gh/anshul-si/71/orig -> origin/gh/anshul-si/71/orig 2025-12-04T10:32:19.3866518Z * [new branch] gh/anshul-si/72/base -> origin/gh/anshul-si/72/base 2025-12-04T10:32:19.3866700Z * [new branch] gh/anshul-si/72/head -> origin/gh/anshul-si/72/head 2025-12-04T10:32:19.3866878Z * [new branch] gh/anshul-si/72/orig -> origin/gh/anshul-si/72/orig 2025-12-04T10:32:19.3867061Z * [new branch] gh/anshul-si/73/base -> origin/gh/anshul-si/73/base 2025-12-04T10:32:19.3867242Z * [new branch] gh/anshul-si/73/head -> origin/gh/anshul-si/73/head 2025-12-04T10:32:19.3867422Z * [new branch] gh/anshul-si/73/orig -> origin/gh/anshul-si/73/orig 
2025-12-04T10:32:19.3867605Z * [new branch] gh/aorenste/132/base -> origin/gh/aorenste/132/base 2025-12-04T10:32:19.3867789Z * [new branch] gh/aorenste/132/head -> origin/gh/aorenste/132/head 2025-12-04T10:32:19.3867974Z * [new branch] gh/aorenste/134/base -> origin/gh/aorenste/134/base 2025-12-04T10:32:19.3868159Z * [new branch] gh/aorenste/134/head -> origin/gh/aorenste/134/head 2025-12-04T10:32:19.3868340Z * [new branch] gh/aorenste/134/orig -> origin/gh/aorenste/134/orig 2025-12-04T10:32:19.3868520Z * [new branch] gh/aorenste/139/base -> origin/gh/aorenste/139/base 2025-12-04T10:32:19.3868703Z * [new branch] gh/aorenste/139/head -> origin/gh/aorenste/139/head 2025-12-04T10:32:19.3868924Z * [new branch] gh/aorenste/139/orig -> origin/gh/aorenste/139/orig 2025-12-04T10:32:19.3869106Z * [new branch] gh/aorenste/141/base -> origin/gh/aorenste/141/base 2025-12-04T10:32:19.3869289Z * [new branch] gh/aorenste/141/head -> origin/gh/aorenste/141/head 2025-12-04T10:32:19.3869469Z * [new branch] gh/aorenste/145/base -> origin/gh/aorenste/145/base 2025-12-04T10:32:19.3869701Z * [new branch] gh/aorenste/145/head -> origin/gh/aorenste/145/head 2025-12-04T10:32:19.3869886Z * [new branch] gh/aorenste/145/orig -> origin/gh/aorenste/145/orig 2025-12-04T10:32:19.3870068Z * [new branch] gh/aorenste/146/base -> origin/gh/aorenste/146/base 2025-12-04T10:32:19.3870251Z * [new branch] gh/aorenste/146/head -> origin/gh/aorenste/146/head 2025-12-04T10:32:19.3870434Z * [new branch] gh/aorenste/146/orig -> origin/gh/aorenste/146/orig 2025-12-04T10:32:19.3870620Z * [new branch] gh/aorenste/147/base -> origin/gh/aorenste/147/base 2025-12-04T10:32:19.3870806Z * [new branch] gh/aorenste/147/head -> origin/gh/aorenste/147/head 2025-12-04T10:32:19.3870989Z * [new branch] gh/aorenste/147/orig -> origin/gh/aorenste/147/orig 2025-12-04T10:32:19.3871219Z * [new branch] gh/aorenste/148/base -> origin/gh/aorenste/148/base 2025-12-04T10:32:19.3871401Z * [new branch] gh/aorenste/148/head -> origin/gh/aorenste/148/head 2025-12-04T10:32:19.3871585Z * [new branch] gh/aorenste/148/orig -> origin/gh/aorenste/148/orig 2025-12-04T10:32:19.3871765Z * [new branch] gh/aorenste/149/base -> origin/gh/aorenste/149/base 2025-12-04T10:32:19.3871950Z * [new branch] gh/aorenste/149/head -> origin/gh/aorenste/149/head 2025-12-04T10:32:19.3872134Z * [new branch] gh/aorenste/149/orig -> origin/gh/aorenste/149/orig 2025-12-04T10:32:19.3872317Z * [new branch] gh/aorenste/150/base -> origin/gh/aorenste/150/base 2025-12-04T10:32:19.3872502Z * [new branch] gh/aorenste/150/head -> origin/gh/aorenste/150/head 2025-12-04T10:32:19.3872683Z * [new branch] gh/aorenste/150/orig -> origin/gh/aorenste/150/orig 2025-12-04T10:32:19.3872868Z * [new branch] gh/aorenste/151/base -> origin/gh/aorenste/151/base 2025-12-04T10:32:19.3873051Z * [new branch] gh/aorenste/151/head -> origin/gh/aorenste/151/head 2025-12-04T10:32:19.3873233Z * [new branch] gh/aorenste/151/orig -> origin/gh/aorenste/151/orig 2025-12-04T10:32:19.3873419Z * [new branch] gh/aorenste/152/base -> origin/gh/aorenste/152/base 2025-12-04T10:32:19.3873601Z * [new branch] gh/aorenste/152/head -> origin/gh/aorenste/152/head 2025-12-04T10:32:19.3873782Z * [new branch] gh/aorenste/152/orig -> origin/gh/aorenste/152/orig 2025-12-04T10:32:19.3873967Z * [new branch] gh/aorenste/153/base -> origin/gh/aorenste/153/base 2025-12-04T10:32:19.3874150Z * [new branch] gh/aorenste/153/head -> origin/gh/aorenste/153/head 2025-12-04T10:32:19.3874334Z * [new branch] gh/aorenste/153/orig -> origin/gh/aorenste/153/orig 
2025-12-04T10:32:19.3874517Z * [new branch] gh/aorenste/154/base -> origin/gh/aorenste/154/base 2025-12-04T10:32:19.3874699Z * [new branch] gh/aorenste/154/head -> origin/gh/aorenste/154/head 2025-12-04T10:32:19.3874878Z * [new branch] gh/aorenste/154/orig -> origin/gh/aorenste/154/orig 2025-12-04T10:32:19.3875060Z * [new branch] gh/aorenste/155/base -> origin/gh/aorenste/155/base 2025-12-04T10:32:19.3875244Z * [new branch] gh/aorenste/155/head -> origin/gh/aorenste/155/head 2025-12-04T10:32:19.3875425Z * [new branch] gh/aorenste/155/orig -> origin/gh/aorenste/155/orig 2025-12-04T10:32:19.3875659Z * [new branch] gh/aorenste/156/base -> origin/gh/aorenste/156/base 2025-12-04T10:32:19.3875843Z * [new branch] gh/aorenste/156/head -> origin/gh/aorenste/156/head 2025-12-04T10:32:19.3876023Z * [new branch] gh/aorenste/156/orig -> origin/gh/aorenste/156/orig 2025-12-04T10:32:19.3876208Z * [new branch] gh/aorenste/157/base -> origin/gh/aorenste/157/base 2025-12-04T10:32:19.3876394Z * [new branch] gh/aorenste/157/head -> origin/gh/aorenste/157/head 2025-12-04T10:32:19.3876573Z * [new branch] gh/aorenste/157/orig -> origin/gh/aorenste/157/orig 2025-12-04T10:32:19.3876755Z * [new branch] gh/aorenste/158/base -> origin/gh/aorenste/158/base 2025-12-04T10:32:19.3876933Z * [new branch] gh/aorenste/158/head -> origin/gh/aorenste/158/head 2025-12-04T10:32:19.3877117Z * [new branch] gh/aorenste/158/orig -> origin/gh/aorenste/158/orig 2025-12-04T10:32:19.3877305Z * [new branch] gh/aorenste/159/base -> origin/gh/aorenste/159/base 2025-12-04T10:32:19.3877483Z * [new branch] gh/aorenste/159/head -> origin/gh/aorenste/159/head 2025-12-04T10:32:19.3877663Z * [new branch] gh/aorenste/159/orig -> origin/gh/aorenste/159/orig 2025-12-04T10:32:19.3877895Z * [new branch] gh/avikchaudhuri/1/base -> origin/gh/avikchaudhuri/1/base 2025-12-04T10:32:19.3878091Z * [new branch] gh/avikchaudhuri/1/head -> origin/gh/avikchaudhuri/1/head 2025-12-04T10:32:19.3878288Z * [new branch] gh/avikchaudhuri/2/base -> origin/gh/avikchaudhuri/2/base 2025-12-04T10:32:19.3878487Z * [new branch] gh/avikchaudhuri/2/head -> origin/gh/avikchaudhuri/2/head 2025-12-04T10:32:19.3878680Z * [new branch] gh/avikchaudhuri/2/orig -> origin/gh/avikchaudhuri/2/orig 2025-12-04T10:32:19.3878873Z * [new branch] gh/bdhirsh/666/base -> origin/gh/bdhirsh/666/base 2025-12-04T10:32:19.3879057Z * [new branch] gh/bdhirsh/666/head -> origin/gh/bdhirsh/666/head 2025-12-04T10:32:19.3879237Z * [new branch] gh/bdhirsh/666/orig -> origin/gh/bdhirsh/666/orig 2025-12-04T10:32:19.3879424Z * [new branch] gh/bdhirsh/668/base -> origin/gh/bdhirsh/668/base 2025-12-04T10:32:19.3879660Z * [new branch] gh/bdhirsh/668/head -> origin/gh/bdhirsh/668/head 2025-12-04T10:32:19.3879840Z * [new branch] gh/bdhirsh/668/orig -> origin/gh/bdhirsh/668/orig 2025-12-04T10:32:19.3880023Z * [new branch] gh/bdhirsh/669/base -> origin/gh/bdhirsh/669/base 2025-12-04T10:32:19.3880203Z * [new branch] gh/bdhirsh/669/head -> origin/gh/bdhirsh/669/head 2025-12-04T10:32:19.3880379Z * [new branch] gh/bdhirsh/669/orig -> origin/gh/bdhirsh/669/orig 2025-12-04T10:32:19.3880561Z * [new branch] gh/bdhirsh/670/base -> origin/gh/bdhirsh/670/base 2025-12-04T10:32:19.3880742Z * [new branch] gh/bdhirsh/670/head -> origin/gh/bdhirsh/670/head 2025-12-04T10:32:19.3880921Z * [new branch] gh/bdhirsh/670/orig -> origin/gh/bdhirsh/670/orig 2025-12-04T10:32:19.3881100Z * [new branch] gh/bdhirsh/672/base -> origin/gh/bdhirsh/672/base 2025-12-04T10:32:19.3881281Z * [new branch] gh/bdhirsh/672/head -> origin/gh/bdhirsh/672/head 
2025-12-04T10:32:19.3881462Z * [new branch] gh/bdhirsh/672/orig -> origin/gh/bdhirsh/672/orig 2025-12-04T10:32:19.3881643Z * [new branch] gh/bdhirsh/675/base -> origin/gh/bdhirsh/675/base 2025-12-04T10:32:19.3881819Z * [new branch] gh/bdhirsh/675/head -> origin/gh/bdhirsh/675/head 2025-12-04T10:32:19.3882000Z * [new branch] gh/bdhirsh/675/orig -> origin/gh/bdhirsh/675/orig 2025-12-04T10:32:19.3882245Z * [new branch] gh/bdhirsh/676/base -> origin/gh/bdhirsh/676/base 2025-12-04T10:32:19.3882422Z * [new branch] gh/bdhirsh/676/head -> origin/gh/bdhirsh/676/head 2025-12-04T10:32:19.3882602Z * [new branch] gh/bdhirsh/676/orig -> origin/gh/bdhirsh/676/orig 2025-12-04T10:32:19.3882675Z * [new branch] gh/bdhirsh/677/base -> origin/gh/bdhirsh/677/base 2025-12-04T10:32:19.3882753Z * [new branch] gh/bdhirsh/677/head -> origin/gh/bdhirsh/677/head 2025-12-04T10:32:19.3882824Z * [new branch] gh/bdhirsh/677/orig -> origin/gh/bdhirsh/677/orig 2025-12-04T10:32:19.3882895Z * [new branch] gh/bdhirsh/678/base -> origin/gh/bdhirsh/678/base 2025-12-04T10:32:19.3882968Z * [new branch] gh/bdhirsh/678/head -> origin/gh/bdhirsh/678/head 2025-12-04T10:32:19.3883038Z * [new branch] gh/bdhirsh/678/orig -> origin/gh/bdhirsh/678/orig 2025-12-04T10:32:19.3883109Z * [new branch] gh/bdhirsh/679/base -> origin/gh/bdhirsh/679/base 2025-12-04T10:32:19.3883181Z * [new branch] gh/bdhirsh/679/head -> origin/gh/bdhirsh/679/head 2025-12-04T10:32:19.3883251Z * [new branch] gh/bdhirsh/679/orig -> origin/gh/bdhirsh/679/orig 2025-12-04T10:32:19.3883322Z * [new branch] gh/bdhirsh/680/base -> origin/gh/bdhirsh/680/base 2025-12-04T10:32:19.3883448Z * [new branch] gh/bdhirsh/680/head -> origin/gh/bdhirsh/680/head 2025-12-04T10:32:19.3883520Z * [new branch] gh/bdhirsh/680/orig -> origin/gh/bdhirsh/680/orig 2025-12-04T10:32:19.3883592Z * [new branch] gh/bdhirsh/681/base -> origin/gh/bdhirsh/681/base 2025-12-04T10:32:19.3883663Z * [new branch] gh/bdhirsh/681/head -> origin/gh/bdhirsh/681/head 2025-12-04T10:32:19.3883734Z * [new branch] gh/bdhirsh/681/orig -> origin/gh/bdhirsh/681/orig 2025-12-04T10:32:19.3883834Z * [new branch] gh/benjaminglass1/101/base -> origin/gh/benjaminglass1/101/base 2025-12-04T10:32:19.3883924Z * [new branch] gh/benjaminglass1/101/head -> origin/gh/benjaminglass1/101/head 2025-12-04T10:32:19.3884011Z * [new branch] gh/benjaminglass1/101/orig -> origin/gh/benjaminglass1/101/orig 2025-12-04T10:32:19.3884102Z * [new branch] gh/benjaminglass1/102/base -> origin/gh/benjaminglass1/102/base 2025-12-04T10:32:19.3884188Z * [new branch] gh/benjaminglass1/102/head -> origin/gh/benjaminglass1/102/head 2025-12-04T10:32:19.3884275Z * [new branch] gh/benjaminglass1/102/orig -> origin/gh/benjaminglass1/102/orig 2025-12-04T10:32:19.3884365Z * [new branch] gh/benjaminglass1/106/base -> origin/gh/benjaminglass1/106/base 2025-12-04T10:32:19.3884452Z * [new branch] gh/benjaminglass1/106/head -> origin/gh/benjaminglass1/106/head 2025-12-04T10:32:19.3884540Z * [new branch] gh/benjaminglass1/106/orig -> origin/gh/benjaminglass1/106/orig 2025-12-04T10:32:19.3884632Z * [new branch] gh/benjaminglass1/107/base -> origin/gh/benjaminglass1/107/base 2025-12-04T10:32:19.3884718Z * [new branch] gh/benjaminglass1/107/head -> origin/gh/benjaminglass1/107/head 2025-12-04T10:32:19.3884804Z * [new branch] gh/benjaminglass1/107/orig -> origin/gh/benjaminglass1/107/orig 2025-12-04T10:32:19.3884895Z * [new branch] gh/benjaminglass1/108/base -> origin/gh/benjaminglass1/108/base 2025-12-04T10:32:19.3884982Z * [new branch] gh/benjaminglass1/108/head -> 
origin/gh/benjaminglass1/108/head 2025-12-04T10:32:19.3885071Z * [new branch] gh/benjaminglass1/108/orig -> origin/gh/benjaminglass1/108/orig 2025-12-04T10:32:19.3885157Z * [new branch] gh/benjaminglass1/109/base -> origin/gh/benjaminglass1/109/base 2025-12-04T10:32:19.3885242Z * [new branch] gh/benjaminglass1/109/head -> origin/gh/benjaminglass1/109/head 2025-12-04T10:32:19.3885375Z * [new branch] gh/benjaminglass1/109/orig -> origin/gh/benjaminglass1/109/orig 2025-12-04T10:32:19.3885458Z * [new branch] gh/benjaminglass1/97/base -> origin/gh/benjaminglass1/97/base 2025-12-04T10:32:19.3885541Z * [new branch] gh/benjaminglass1/97/head -> origin/gh/benjaminglass1/97/head 2025-12-04T10:32:19.3885627Z * [new branch] gh/benjaminglass1/97/orig -> origin/gh/benjaminglass1/97/orig 2025-12-04T10:32:19.3885707Z * [new branch] gh/bobrenjc93/570/base -> origin/gh/bobrenjc93/570/base 2025-12-04T10:32:19.3885783Z * [new branch] gh/bobrenjc93/570/head -> origin/gh/bobrenjc93/570/head 2025-12-04T10:32:19.3885861Z * [new branch] gh/bobrenjc93/570/orig -> origin/gh/bobrenjc93/570/orig 2025-12-04T10:32:19.3885938Z * [new branch] gh/bobrenjc93/604/base -> origin/gh/bobrenjc93/604/base 2025-12-04T10:32:19.3886013Z * [new branch] gh/bobrenjc93/604/head -> origin/gh/bobrenjc93/604/head 2025-12-04T10:32:19.3886090Z * [new branch] gh/bobrenjc93/604/orig -> origin/gh/bobrenjc93/604/orig 2025-12-04T10:32:19.3886163Z * [new branch] gh/bobrenjc93/638/base -> origin/gh/bobrenjc93/638/base 2025-12-04T10:32:19.3886237Z * [new branch] gh/bobrenjc93/638/head -> origin/gh/bobrenjc93/638/head 2025-12-04T10:32:19.3886340Z * [new branch] gh/bobrenjc93/638/orig -> origin/gh/bobrenjc93/638/orig 2025-12-04T10:32:19.3886413Z * [new branch] gh/bobrenjc93/653/base -> origin/gh/bobrenjc93/653/base 2025-12-04T10:32:19.3886487Z * [new branch] gh/bobrenjc93/653/head -> origin/gh/bobrenjc93/653/head 2025-12-04T10:32:19.3886562Z * [new branch] gh/bobrenjc93/653/orig -> origin/gh/bobrenjc93/653/orig 2025-12-04T10:32:19.3886637Z * [new branch] gh/bobrenjc93/654/base -> origin/gh/bobrenjc93/654/base 2025-12-04T10:32:19.3886716Z * [new branch] gh/bobrenjc93/654/head -> origin/gh/bobrenjc93/654/head 2025-12-04T10:32:19.3886790Z * [new branch] gh/bobrenjc93/654/orig -> origin/gh/bobrenjc93/654/orig 2025-12-04T10:32:19.3886864Z * [new branch] gh/bobrenjc93/657/base -> origin/gh/bobrenjc93/657/base 2025-12-04T10:32:19.3886944Z * [new branch] gh/bobrenjc93/657/head -> origin/gh/bobrenjc93/657/head 2025-12-04T10:32:19.3887016Z * [new branch] gh/bobrenjc93/657/orig -> origin/gh/bobrenjc93/657/orig 2025-12-04T10:32:19.3887092Z * [new branch] gh/bobrenjc93/672/base -> origin/gh/bobrenjc93/672/base 2025-12-04T10:32:19.3887168Z * [new branch] gh/bobrenjc93/672/head -> origin/gh/bobrenjc93/672/head 2025-12-04T10:32:19.3887242Z * [new branch] gh/bobrenjc93/672/orig -> origin/gh/bobrenjc93/672/orig 2025-12-04T10:32:19.3887316Z * [new branch] gh/bobrenjc93/679/base -> origin/gh/bobrenjc93/679/base 2025-12-04T10:32:19.3887393Z * [new branch] gh/bobrenjc93/679/head -> origin/gh/bobrenjc93/679/head 2025-12-04T10:32:19.3887467Z * [new branch] gh/bobrenjc93/679/orig -> origin/gh/bobrenjc93/679/orig 2025-12-04T10:32:19.3887539Z * [new branch] gh/bobrenjc93/680/base -> origin/gh/bobrenjc93/680/base 2025-12-04T10:32:19.3887617Z * [new branch] gh/bobrenjc93/680/head -> origin/gh/bobrenjc93/680/head 2025-12-04T10:32:19.3887692Z * [new branch] gh/bobrenjc93/680/orig -> origin/gh/bobrenjc93/680/orig 2025-12-04T10:32:19.3887765Z * [new branch] gh/bobrenjc93/681/base -> 
origin/gh/bobrenjc93/681/base
[... several hundred additional "* [new branch] gh/<user>/<n>/<base|head|orig> -> origin/gh/<user>/<n>/<base|head|orig>" git fetch lines omitted; they list new remote-tracking branches for users bobrenjc93, c00w, clee2000, coconutruben, colinchan15, d4l3k, davidberard98, desertfire, dharakk, drisspg, dsjohns2, dzmitry-huba, eellison, etaf, exclamaforte, ezyang, fadara01, fduwjj, fegin, fffrog, fxdawnn, galv, guangyey, and guilhermeleobas ...]
2025-12-04T10:32:19.3957297Z * [new branch] gh/guilhermeleobas/226/head -> origin/gh/guilhermeleobas/226/head 2025-12-04T10:32:19.3957384Z * [new branch] gh/guilhermeleobas/226/orig -> origin/gh/guilhermeleobas/226/orig 2025-12-04T10:32:19.3957473Z * [new branch] gh/guilhermeleobas/236/base -> origin/gh/guilhermeleobas/236/base 2025-12-04T10:32:19.3957559Z * [new branch] gh/guilhermeleobas/236/head -> origin/gh/guilhermeleobas/236/head 2025-12-04T10:32:19.3957651Z * [new branch] gh/guilhermeleobas/236/orig -> origin/gh/guilhermeleobas/236/orig 2025-12-04T10:32:19.3957741Z * [new branch] gh/guilhermeleobas/247/base -> origin/gh/guilhermeleobas/247/base 2025-12-04T10:32:19.3957830Z * [new branch] gh/guilhermeleobas/247/head -> origin/gh/guilhermeleobas/247/head 2025-12-04T10:32:19.3957952Z * [new branch] gh/guilhermeleobas/247/orig -> origin/gh/guilhermeleobas/247/orig 2025-12-04T10:32:19.3958040Z * [new branch] gh/guilhermeleobas/248/base -> origin/gh/guilhermeleobas/248/base 2025-12-04T10:32:19.3958128Z * [new branch] gh/guilhermeleobas/248/head -> origin/gh/guilhermeleobas/248/head 2025-12-04T10:32:19.3958219Z * [new branch] gh/guilhermeleobas/248/orig -> origin/gh/guilhermeleobas/248/orig 2025-12-04T10:32:19.3958309Z * [new branch] gh/guilhermeleobas/250/base -> origin/gh/guilhermeleobas/250/base 2025-12-04T10:32:19.3958396Z * [new branch] gh/guilhermeleobas/250/head -> origin/gh/guilhermeleobas/250/head 2025-12-04T10:32:19.3958486Z * [new branch] gh/guilhermeleobas/250/orig -> origin/gh/guilhermeleobas/250/orig 2025-12-04T10:32:19.3958574Z * [new branch] gh/guilhermeleobas/253/base -> origin/gh/guilhermeleobas/253/base 2025-12-04T10:32:19.3958662Z * [new branch] gh/guilhermeleobas/253/head -> origin/gh/guilhermeleobas/253/head 2025-12-04T10:32:19.3958751Z * [new branch] gh/guilhermeleobas/253/orig -> origin/gh/guilhermeleobas/253/orig 2025-12-04T10:32:19.3958838Z * [new branch] gh/guilhermeleobas/254/base -> origin/gh/guilhermeleobas/254/base 2025-12-04T10:32:19.3958951Z * [new branch] gh/guilhermeleobas/254/head -> origin/gh/guilhermeleobas/254/head 2025-12-04T10:32:19.3959041Z * [new branch] gh/guilhermeleobas/254/orig -> origin/gh/guilhermeleobas/254/orig 2025-12-04T10:32:19.3959129Z * [new branch] gh/guilhermeleobas/255/base -> origin/gh/guilhermeleobas/255/base 2025-12-04T10:32:19.3959219Z * [new branch] gh/guilhermeleobas/255/head -> origin/gh/guilhermeleobas/255/head 2025-12-04T10:32:19.3959306Z * [new branch] gh/guilhermeleobas/255/orig -> origin/gh/guilhermeleobas/255/orig 2025-12-04T10:32:19.3959395Z * [new branch] gh/guilhermeleobas/256/base -> origin/gh/guilhermeleobas/256/base 2025-12-04T10:32:19.3959486Z * [new branch] gh/guilhermeleobas/256/head -> origin/gh/guilhermeleobas/256/head 2025-12-04T10:32:19.3959608Z * [new branch] gh/guilhermeleobas/256/orig -> origin/gh/guilhermeleobas/256/orig 2025-12-04T10:32:19.3959699Z * [new branch] gh/guilhermeleobas/257/base -> origin/gh/guilhermeleobas/257/base 2025-12-04T10:32:19.3959789Z * [new branch] gh/guilhermeleobas/257/head -> origin/gh/guilhermeleobas/257/head 2025-12-04T10:32:19.3959878Z * [new branch] gh/guilhermeleobas/257/orig -> origin/gh/guilhermeleobas/257/orig 2025-12-04T10:32:19.3959965Z * [new branch] gh/guilhermeleobas/258/base -> origin/gh/guilhermeleobas/258/base 2025-12-04T10:32:19.3960054Z * [new branch] gh/guilhermeleobas/258/head -> origin/gh/guilhermeleobas/258/head 2025-12-04T10:32:19.3960141Z * [new branch] gh/guilhermeleobas/258/orig -> origin/gh/guilhermeleobas/258/orig 2025-12-04T10:32:19.3960228Z * 
[new branch] gh/guilhermeleobas/259/base -> origin/gh/guilhermeleobas/259/base 2025-12-04T10:32:19.3960319Z * [new branch] gh/guilhermeleobas/259/head -> origin/gh/guilhermeleobas/259/head 2025-12-04T10:32:19.3960407Z * [new branch] gh/guilhermeleobas/259/orig -> origin/gh/guilhermeleobas/259/orig 2025-12-04T10:32:19.3960494Z * [new branch] gh/guilhermeleobas/260/base -> origin/gh/guilhermeleobas/260/base 2025-12-04T10:32:19.3960581Z * [new branch] gh/guilhermeleobas/260/head -> origin/gh/guilhermeleobas/260/head 2025-12-04T10:32:19.3960667Z * [new branch] gh/guilhermeleobas/260/orig -> origin/gh/guilhermeleobas/260/orig 2025-12-04T10:32:19.3960754Z * [new branch] gh/guilhermeleobas/261/base -> origin/gh/guilhermeleobas/261/base 2025-12-04T10:32:19.3960841Z * [new branch] gh/guilhermeleobas/261/head -> origin/gh/guilhermeleobas/261/head 2025-12-04T10:32:19.3960985Z * [new branch] gh/guilhermeleobas/261/orig -> origin/gh/guilhermeleobas/261/orig 2025-12-04T10:32:19.3961073Z * [new branch] gh/guilhermeleobas/262/base -> origin/gh/guilhermeleobas/262/base 2025-12-04T10:32:19.3961160Z * [new branch] gh/guilhermeleobas/262/head -> origin/gh/guilhermeleobas/262/head 2025-12-04T10:32:19.3961249Z * [new branch] gh/guilhermeleobas/262/orig -> origin/gh/guilhermeleobas/262/orig 2025-12-04T10:32:19.3961335Z * [new branch] gh/guilhermeleobas/263/base -> origin/gh/guilhermeleobas/263/base 2025-12-04T10:32:19.3961422Z * [new branch] gh/guilhermeleobas/263/head -> origin/gh/guilhermeleobas/263/head 2025-12-04T10:32:19.3961510Z * [new branch] gh/guilhermeleobas/263/orig -> origin/gh/guilhermeleobas/263/orig 2025-12-04T10:32:19.3961597Z * [new branch] gh/guilhermeleobas/264/base -> origin/gh/guilhermeleobas/264/base 2025-12-04T10:32:19.3961687Z * [new branch] gh/guilhermeleobas/264/head -> origin/gh/guilhermeleobas/264/head 2025-12-04T10:32:19.3961775Z * [new branch] gh/guilhermeleobas/264/orig -> origin/gh/guilhermeleobas/264/orig 2025-12-04T10:32:19.3961862Z * [new branch] gh/guilhermeleobas/265/base -> origin/gh/guilhermeleobas/265/base 2025-12-04T10:32:19.3961986Z * [new branch] gh/guilhermeleobas/265/head -> origin/gh/guilhermeleobas/265/head 2025-12-04T10:32:19.3962073Z * [new branch] gh/guilhermeleobas/265/orig -> origin/gh/guilhermeleobas/265/orig 2025-12-04T10:32:19.3962159Z * [new branch] gh/guilhermeleobas/266/base -> origin/gh/guilhermeleobas/266/base 2025-12-04T10:32:19.3962246Z * [new branch] gh/guilhermeleobas/266/head -> origin/gh/guilhermeleobas/266/head 2025-12-04T10:32:19.3962334Z * [new branch] gh/guilhermeleobas/266/orig -> origin/gh/guilhermeleobas/266/orig 2025-12-04T10:32:19.3962422Z * [new branch] gh/guilhermeleobas/267/base -> origin/gh/guilhermeleobas/267/base 2025-12-04T10:32:19.3962508Z * [new branch] gh/guilhermeleobas/267/head -> origin/gh/guilhermeleobas/267/head 2025-12-04T10:32:19.3962596Z * [new branch] gh/guilhermeleobas/267/orig -> origin/gh/guilhermeleobas/267/orig 2025-12-04T10:32:19.3962679Z * [new branch] gh/hameerabbasi/1/base -> origin/gh/hameerabbasi/1/base 2025-12-04T10:32:19.3962758Z * [new branch] gh/hameerabbasi/1/head -> origin/gh/hameerabbasi/1/head 2025-12-04T10:32:19.3962833Z * [new branch] gh/hameerabbasi/2/base -> origin/gh/hameerabbasi/2/base 2025-12-04T10:32:19.3962908Z * [new branch] gh/hameerabbasi/2/head -> origin/gh/hameerabbasi/2/head 2025-12-04T10:32:19.3962983Z * [new branch] gh/hameerabbasi/2/orig -> origin/gh/hameerabbasi/2/orig 2025-12-04T10:32:19.3963058Z * [new branch] gh/hameerabbasi/3/base -> origin/gh/hameerabbasi/3/base 
2025-12-04T10:32:19.3963132Z * [new branch] gh/hameerabbasi/3/head -> origin/gh/hameerabbasi/3/head 2025-12-04T10:32:19.3963206Z * [new branch] gh/hameerabbasi/3/orig -> origin/gh/hameerabbasi/3/orig 2025-12-04T10:32:19.3963279Z * [new branch] gh/hameerabbasi/4/base -> origin/gh/hameerabbasi/4/base 2025-12-04T10:32:19.3963354Z * [new branch] gh/hameerabbasi/4/head -> origin/gh/hameerabbasi/4/head 2025-12-04T10:32:19.3963428Z * [new branch] gh/hameerabbasi/4/orig -> origin/gh/hameerabbasi/4/orig 2025-12-04T10:32:19.3963498Z * [new branch] gh/huydhn/1/next -> origin/gh/huydhn/1/next 2025-12-04T10:32:19.3963567Z * [new branch] gh/huydhn/2/next -> origin/gh/huydhn/2/next 2025-12-04T10:32:19.3963635Z * [new branch] gh/huydhn/3/next -> origin/gh/huydhn/3/next 2025-12-04T10:32:19.3963731Z * [new branch] gh/huydhn/4/next -> origin/gh/huydhn/4/next 2025-12-04T10:32:19.3963796Z * [new branch] gh/huydhn/5/next -> origin/gh/huydhn/5/next 2025-12-04T10:32:19.3963860Z * [new branch] gh/huydhn/6/next -> origin/gh/huydhn/6/next 2025-12-04T10:32:19.3963926Z * [new branch] gh/int3/97/base -> origin/gh/int3/97/base 2025-12-04T10:32:19.3963992Z * [new branch] gh/int3/97/head -> origin/gh/int3/97/head 2025-12-04T10:32:19.3964066Z * [new branch] gh/isuruf/101/base -> origin/gh/isuruf/101/base 2025-12-04T10:32:19.3964134Z * [new branch] gh/isuruf/101/head -> origin/gh/isuruf/101/head 2025-12-04T10:32:19.3964201Z * [new branch] gh/isuruf/146/base -> origin/gh/isuruf/146/base 2025-12-04T10:32:19.3964269Z * [new branch] gh/isuruf/146/head -> origin/gh/isuruf/146/head 2025-12-04T10:32:19.3964336Z * [new branch] gh/isuruf/146/orig -> origin/gh/isuruf/146/orig 2025-12-04T10:32:19.3964403Z * [new branch] gh/isuruf/158/base -> origin/gh/isuruf/158/base 2025-12-04T10:32:19.3964469Z * [new branch] gh/isuruf/158/head -> origin/gh/isuruf/158/head 2025-12-04T10:32:19.3964535Z * [new branch] gh/isuruf/159/base -> origin/gh/isuruf/159/base 2025-12-04T10:32:19.3964634Z * [new branch] gh/isuruf/159/head -> origin/gh/isuruf/159/head 2025-12-04T10:32:19.3964700Z * [new branch] gh/isuruf/160/base -> origin/gh/isuruf/160/base 2025-12-04T10:32:19.3964765Z * [new branch] gh/isuruf/160/head -> origin/gh/isuruf/160/head 2025-12-04T10:32:19.3964832Z * [new branch] gh/isuruf/160/orig -> origin/gh/isuruf/160/orig 2025-12-04T10:32:19.3964899Z * [new branch] gh/isuruf/81/base -> origin/gh/isuruf/81/base 2025-12-04T10:32:19.3964966Z * [new branch] gh/isuruf/81/head -> origin/gh/isuruf/81/head 2025-12-04T10:32:19.3965033Z * [new branch] gh/isuruf/81/orig -> origin/gh/isuruf/81/orig 2025-12-04T10:32:19.3965105Z * [new branch] gh/jamesjwu/176/base -> origin/gh/jamesjwu/176/base 2025-12-04T10:32:19.3965177Z * [new branch] gh/jamesjwu/176/head -> origin/gh/jamesjwu/176/head 2025-12-04T10:32:19.3965251Z * [new branch] gh/jamesjwu/176/orig -> origin/gh/jamesjwu/176/orig 2025-12-04T10:32:19.3965323Z * [new branch] gh/jamesjwu/187/base -> origin/gh/jamesjwu/187/base 2025-12-04T10:32:19.3965394Z * [new branch] gh/jamesjwu/187/head -> origin/gh/jamesjwu/187/head 2025-12-04T10:32:19.3965464Z * [new branch] gh/jamesjwu/187/orig -> origin/gh/jamesjwu/187/orig 2025-12-04T10:32:19.3965533Z * [new branch] gh/jamesjwu/196/base -> origin/gh/jamesjwu/196/base 2025-12-04T10:32:19.3965604Z * [new branch] gh/jamesjwu/196/head -> origin/gh/jamesjwu/196/head 2025-12-04T10:32:19.3965675Z * [new branch] gh/jamesjwu/196/orig -> origin/gh/jamesjwu/196/orig 2025-12-04T10:32:19.3965745Z * [new branch] gh/jamesjwu/198/base -> origin/gh/jamesjwu/198/base 
2025-12-04T10:32:19.3965815Z * [new branch] gh/jamesjwu/198/head -> origin/gh/jamesjwu/198/head 2025-12-04T10:32:19.3965886Z * [new branch] gh/jamesjwu/198/orig -> origin/gh/jamesjwu/198/orig 2025-12-04T10:32:19.3965955Z * [new branch] gh/jamesjwu/207/base -> origin/gh/jamesjwu/207/base 2025-12-04T10:32:19.3966025Z * [new branch] gh/jamesjwu/207/head -> origin/gh/jamesjwu/207/head 2025-12-04T10:32:19.3966094Z * [new branch] gh/jamesjwu/207/orig -> origin/gh/jamesjwu/207/orig 2025-12-04T10:32:19.3966163Z * [new branch] gh/jamesjwu/208/base -> origin/gh/jamesjwu/208/base 2025-12-04T10:32:19.3966278Z * [new branch] gh/jamesjwu/208/head -> origin/gh/jamesjwu/208/head 2025-12-04T10:32:19.3966347Z * [new branch] gh/jamesjwu/208/orig -> origin/gh/jamesjwu/208/orig 2025-12-04T10:32:19.3966418Z * [new branch] gh/jamesjwu/52/base -> origin/gh/jamesjwu/52/base 2025-12-04T10:32:19.3966489Z * [new branch] gh/jamesjwu/52/head -> origin/gh/jamesjwu/52/head 2025-12-04T10:32:19.3966559Z * [new branch] gh/jamesjwu/53/base -> origin/gh/jamesjwu/53/base 2025-12-04T10:32:19.3966628Z * [new branch] gh/jamesjwu/53/head -> origin/gh/jamesjwu/53/head 2025-12-04T10:32:19.3966701Z * [new branch] gh/jamesjwu/54/base -> origin/gh/jamesjwu/54/base 2025-12-04T10:32:19.3966770Z * [new branch] gh/jamesjwu/54/head -> origin/gh/jamesjwu/54/head 2025-12-04T10:32:19.3966839Z * [new branch] gh/jamesjwu/55/base -> origin/gh/jamesjwu/55/base 2025-12-04T10:32:19.3966913Z * [new branch] gh/jamesjwu/55/head -> origin/gh/jamesjwu/55/head 2025-12-04T10:32:19.3966980Z * [new branch] gh/jamesjwu/56/base -> origin/gh/jamesjwu/56/base 2025-12-04T10:32:19.3967048Z * [new branch] gh/jamesjwu/56/head -> origin/gh/jamesjwu/56/head 2025-12-04T10:32:19.3967117Z * [new branch] gh/jamesjwu/57/base -> origin/gh/jamesjwu/57/base 2025-12-04T10:32:19.3967215Z * [new branch] gh/jamesjwu/57/head -> origin/gh/jamesjwu/57/head 2025-12-04T10:32:19.3967283Z * [new branch] gh/jamesjwu/58/base -> origin/gh/jamesjwu/58/base 2025-12-04T10:32:19.3967353Z * [new branch] gh/jamesjwu/58/head -> origin/gh/jamesjwu/58/head 2025-12-04T10:32:19.3967421Z * [new branch] gh/jamesjwu/59/base -> origin/gh/jamesjwu/59/base 2025-12-04T10:32:19.3967494Z * [new branch] gh/jamesjwu/59/head -> origin/gh/jamesjwu/59/head 2025-12-04T10:32:19.3967565Z * [new branch] gh/jamesjwu/60/base -> origin/gh/jamesjwu/60/base 2025-12-04T10:32:19.3967634Z * [new branch] gh/jamesjwu/60/head -> origin/gh/jamesjwu/60/head 2025-12-04T10:32:19.3967707Z * [new branch] gh/jamesjwu/61/base -> origin/gh/jamesjwu/61/base 2025-12-04T10:32:19.3967777Z * [new branch] gh/jamesjwu/61/head -> origin/gh/jamesjwu/61/head 2025-12-04T10:32:19.3967849Z * [new branch] gh/jamesjwu/62/base -> origin/gh/jamesjwu/62/base 2025-12-04T10:32:19.3967921Z * [new branch] gh/jamesjwu/62/head -> origin/gh/jamesjwu/62/head 2025-12-04T10:32:19.3967989Z * [new branch] gh/jamesjwu/63/base -> origin/gh/jamesjwu/63/base 2025-12-04T10:32:19.3968057Z * [new branch] gh/jamesjwu/63/head -> origin/gh/jamesjwu/63/head 2025-12-04T10:32:19.3968129Z * [new branch] gh/jamesjwu/64/base -> origin/gh/jamesjwu/64/base 2025-12-04T10:32:19.3968201Z * [new branch] gh/jamesjwu/64/head -> origin/gh/jamesjwu/64/head 2025-12-04T10:32:19.3968271Z * [new branch] gh/jamesjwu/65/base -> origin/gh/jamesjwu/65/base 2025-12-04T10:32:19.3968342Z * [new branch] gh/jamesjwu/65/head -> origin/gh/jamesjwu/65/head 2025-12-04T10:32:19.3968413Z * [new branch] gh/janeyx99/165/base -> origin/gh/janeyx99/165/base 2025-12-04T10:32:19.3968483Z * [new branch] 
gh/janeyx99/165/head -> origin/gh/janeyx99/165/head 2025-12-04T10:32:19.3968556Z * [new branch] gh/janeyx99/165/orig -> origin/gh/janeyx99/165/orig 2025-12-04T10:32:19.3968625Z * [new branch] gh/janeyx99/201/base -> origin/gh/janeyx99/201/base 2025-12-04T10:32:19.3968694Z * [new branch] gh/janeyx99/201/head -> origin/gh/janeyx99/201/head 2025-12-04T10:32:19.3968765Z * [new branch] gh/janeyx99/201/orig -> origin/gh/janeyx99/201/orig 2025-12-04T10:32:19.3968867Z * [new branch] gh/janeyx99/225/base -> origin/gh/janeyx99/225/base 2025-12-04T10:32:19.3968938Z * [new branch] gh/janeyx99/225/head -> origin/gh/janeyx99/225/head 2025-12-04T10:32:19.3969008Z * [new branch] gh/janeyx99/225/orig -> origin/gh/janeyx99/225/orig 2025-12-04T10:32:19.3969077Z * [new branch] gh/janeyx99/299/base -> origin/gh/janeyx99/299/base 2025-12-04T10:32:19.3969150Z * [new branch] gh/janeyx99/299/head -> origin/gh/janeyx99/299/head 2025-12-04T10:32:19.3969219Z * [new branch] gh/janeyx99/299/orig -> origin/gh/janeyx99/299/orig 2025-12-04T10:32:19.3969287Z * [new branch] gh/janeyx99/302/base -> origin/gh/janeyx99/302/base 2025-12-04T10:32:19.3969359Z * [new branch] gh/janeyx99/302/head -> origin/gh/janeyx99/302/head 2025-12-04T10:32:19.3969429Z * [new branch] gh/janeyx99/303/base -> origin/gh/janeyx99/303/base 2025-12-04T10:32:19.3969502Z * [new branch] gh/janeyx99/303/head -> origin/gh/janeyx99/303/head 2025-12-04T10:32:19.3969631Z * [new branch] gh/janeyx99/305/base -> origin/gh/janeyx99/305/base 2025-12-04T10:32:19.3969703Z * [new branch] gh/janeyx99/305/head -> origin/gh/janeyx99/305/head 2025-12-04T10:32:19.3969821Z * [new branch] gh/janeyx99/306/base -> origin/gh/janeyx99/306/base 2025-12-04T10:32:19.3969894Z * [new branch] gh/janeyx99/306/head -> origin/gh/janeyx99/306/head 2025-12-04T10:32:19.3969964Z * [new branch] gh/janeyx99/314/base -> origin/gh/janeyx99/314/base 2025-12-04T10:32:19.3970034Z * [new branch] gh/janeyx99/314/head -> origin/gh/janeyx99/314/head 2025-12-04T10:32:19.3970105Z * [new branch] gh/janeyx99/314/orig -> origin/gh/janeyx99/314/orig 2025-12-04T10:32:19.3970173Z * [new branch] gh/janeyx99/315/base -> origin/gh/janeyx99/315/base 2025-12-04T10:32:19.3970245Z * [new branch] gh/janeyx99/315/head -> origin/gh/janeyx99/315/head 2025-12-04T10:32:19.3970316Z * [new branch] gh/janeyx99/315/orig -> origin/gh/janeyx99/315/orig 2025-12-04T10:32:19.3970386Z * [new branch] gh/janeyx99/316/base -> origin/gh/janeyx99/316/base 2025-12-04T10:32:19.3970457Z * [new branch] gh/janeyx99/316/head -> origin/gh/janeyx99/316/head 2025-12-04T10:32:19.3970529Z * [new branch] gh/janeyx99/316/orig -> origin/gh/janeyx99/316/orig 2025-12-04T10:32:19.3970597Z * [new branch] gh/janeyx99/317/base -> origin/gh/janeyx99/317/base 2025-12-04T10:32:19.3970665Z * [new branch] gh/janeyx99/317/head -> origin/gh/janeyx99/317/head 2025-12-04T10:32:19.3970736Z * [new branch] gh/janeyx99/317/orig -> origin/gh/janeyx99/317/orig 2025-12-04T10:32:19.3970805Z * [new branch] gh/janeyx99/325/base -> origin/gh/janeyx99/325/base 2025-12-04T10:32:19.3970881Z * [new branch] gh/janeyx99/325/head -> origin/gh/janeyx99/325/head 2025-12-04T10:32:19.3970950Z * [new branch] gh/janeyx99/325/orig -> origin/gh/janeyx99/325/orig 2025-12-04T10:32:19.3971021Z * [new branch] gh/janeyx99/327/base -> origin/gh/janeyx99/327/base 2025-12-04T10:32:19.3971093Z * [new branch] gh/janeyx99/327/head -> origin/gh/janeyx99/327/head 2025-12-04T10:32:19.3971163Z * [new branch] gh/janeyx99/327/orig -> origin/gh/janeyx99/327/orig 2025-12-04T10:32:19.3971233Z * [new branch] 
gh/janeyx99/328/base -> origin/gh/janeyx99/328/base 2025-12-04T10:32:19.3971305Z * [new branch] gh/janeyx99/328/head -> origin/gh/janeyx99/328/head 2025-12-04T10:32:19.3971374Z * [new branch] gh/janeyx99/328/orig -> origin/gh/janeyx99/328/orig 2025-12-04T10:32:19.3971443Z * [new branch] gh/janeyx99/329/base -> origin/gh/janeyx99/329/base 2025-12-04T10:32:19.3971560Z * [new branch] gh/janeyx99/329/head -> origin/gh/janeyx99/329/head 2025-12-04T10:32:19.3971630Z * [new branch] gh/janeyx99/329/orig -> origin/gh/janeyx99/329/orig 2025-12-04T10:32:19.3971700Z * [new branch] gh/janeyx99/330/base -> origin/gh/janeyx99/330/base 2025-12-04T10:32:19.3971773Z * [new branch] gh/janeyx99/330/head -> origin/gh/janeyx99/330/head 2025-12-04T10:32:19.3971843Z * [new branch] gh/janeyx99/330/orig -> origin/gh/janeyx99/330/orig 2025-12-04T10:32:19.3971913Z * [new branch] gh/janeyx99/331/base -> origin/gh/janeyx99/331/base 2025-12-04T10:32:19.3971985Z * [new branch] gh/janeyx99/331/head -> origin/gh/janeyx99/331/head 2025-12-04T10:32:19.3972055Z * [new branch] gh/janeyx99/331/orig -> origin/gh/janeyx99/331/orig 2025-12-04T10:32:19.3972125Z * [new branch] gh/janeyx99/332/base -> origin/gh/janeyx99/332/base 2025-12-04T10:32:19.3972198Z * [new branch] gh/janeyx99/332/head -> origin/gh/janeyx99/332/head 2025-12-04T10:32:19.3972268Z * [new branch] gh/janeyx99/332/orig -> origin/gh/janeyx99/332/orig 2025-12-04T10:32:19.3972339Z * [new branch] gh/janeyx99/333/base -> origin/gh/janeyx99/333/base 2025-12-04T10:32:19.3972441Z * [new branch] gh/janeyx99/333/head -> origin/gh/janeyx99/333/head 2025-12-04T10:32:19.3972512Z * [new branch] gh/janeyx99/333/orig -> origin/gh/janeyx99/333/orig 2025-12-04T10:32:19.3972581Z * [new branch] gh/janeyx99/88/base -> origin/gh/janeyx99/88/base 2025-12-04T10:32:19.3972649Z * [new branch] gh/janeyx99/88/head -> origin/gh/janeyx99/88/head 2025-12-04T10:32:19.3972717Z * [new branch] gh/janeyx99/88/orig -> origin/gh/janeyx99/88/orig 2025-12-04T10:32:19.3972786Z * [new branch] gh/jansel/360/base -> origin/gh/jansel/360/base 2025-12-04T10:32:19.3972856Z * [new branch] gh/jansel/360/head -> origin/gh/jansel/360/head 2025-12-04T10:32:19.3972924Z * [new branch] gh/jansel/451/base -> origin/gh/jansel/451/base 2025-12-04T10:32:19.3972993Z * [new branch] gh/jansel/451/head -> origin/gh/jansel/451/head 2025-12-04T10:32:19.3973062Z * [new branch] gh/jansel/451/orig -> origin/gh/jansel/451/orig 2025-12-04T10:32:19.3973127Z * [new branch] gh/jansel/462/base -> origin/gh/jansel/462/base 2025-12-04T10:32:19.3973195Z * [new branch] gh/jansel/462/head -> origin/gh/jansel/462/head 2025-12-04T10:32:19.3973262Z * [new branch] gh/jansel/462/orig -> origin/gh/jansel/462/orig 2025-12-04T10:32:19.3973328Z * [new branch] gh/jansel/533/base -> origin/gh/jansel/533/base 2025-12-04T10:32:19.3973395Z * [new branch] gh/jansel/533/head -> origin/gh/jansel/533/head 2025-12-04T10:32:19.3973464Z * [new branch] gh/jansel/533/orig -> origin/gh/jansel/533/orig 2025-12-04T10:32:19.3973531Z * [new branch] gh/jansel/552/base -> origin/gh/jansel/552/base 2025-12-04T10:32:19.3973598Z * [new branch] gh/jansel/552/head -> origin/gh/jansel/552/head 2025-12-04T10:32:19.3973665Z * [new branch] gh/jansel/552/orig -> origin/gh/jansel/552/orig 2025-12-04T10:32:19.3973732Z * [new branch] gh/jansel/553/base -> origin/gh/jansel/553/base 2025-12-04T10:32:19.3973801Z * [new branch] gh/jansel/553/head -> origin/gh/jansel/553/head 2025-12-04T10:32:19.3973868Z * [new branch] gh/jansel/553/orig -> origin/gh/jansel/553/orig 
2025-12-04T10:32:19.3973937Z * [new branch] gh/jansel/554/base -> origin/gh/jansel/554/base 2025-12-04T10:32:19.3974004Z * [new branch] gh/jansel/554/head -> origin/gh/jansel/554/head 2025-12-04T10:32:19.3974102Z * [new branch] gh/jansel/554/orig -> origin/gh/jansel/554/orig 2025-12-04T10:32:19.3974170Z * [new branch] gh/jansel/555/base -> origin/gh/jansel/555/base 2025-12-04T10:32:19.3974238Z * [new branch] gh/jansel/555/head -> origin/gh/jansel/555/head 2025-12-04T10:32:19.3974306Z * [new branch] gh/jansel/555/orig -> origin/gh/jansel/555/orig 2025-12-04T10:32:19.3974373Z * [new branch] gh/jansel/556/base -> origin/gh/jansel/556/base 2025-12-04T10:32:19.3974440Z * [new branch] gh/jansel/556/head -> origin/gh/jansel/556/head 2025-12-04T10:32:19.3974506Z * [new branch] gh/jansel/556/orig -> origin/gh/jansel/556/orig 2025-12-04T10:32:19.3974577Z * [new branch] gh/jansel/557/base -> origin/gh/jansel/557/base 2025-12-04T10:32:19.3974645Z * [new branch] gh/jansel/557/head -> origin/gh/jansel/557/head 2025-12-04T10:32:19.3974713Z * [new branch] gh/jansel/557/orig -> origin/gh/jansel/557/orig 2025-12-04T10:32:19.3974783Z * [new branch] gh/jansel/558/base -> origin/gh/jansel/558/base 2025-12-04T10:32:19.3974850Z * [new branch] gh/jansel/558/head -> origin/gh/jansel/558/head 2025-12-04T10:32:19.3974942Z * [new branch] gh/jansel/558/orig -> origin/gh/jansel/558/orig 2025-12-04T10:32:19.3975013Z * [new branch] gh/jansel/559/base -> origin/gh/jansel/559/base 2025-12-04T10:32:19.3975080Z * [new branch] gh/jansel/559/head -> origin/gh/jansel/559/head 2025-12-04T10:32:19.3975145Z * [new branch] gh/jansel/559/orig -> origin/gh/jansel/559/orig 2025-12-04T10:32:19.3975213Z * [new branch] gh/jansel/560/base -> origin/gh/jansel/560/base 2025-12-04T10:32:19.3975280Z * [new branch] gh/jansel/560/head -> origin/gh/jansel/560/head 2025-12-04T10:32:19.3975348Z * [new branch] gh/jansel/560/orig -> origin/gh/jansel/560/orig 2025-12-04T10:32:19.3975418Z * [new branch] gh/jansel/561/base -> origin/gh/jansel/561/base 2025-12-04T10:32:19.3975485Z * [new branch] gh/jansel/561/head -> origin/gh/jansel/561/head 2025-12-04T10:32:19.3975555Z * [new branch] gh/jansel/561/orig -> origin/gh/jansel/561/orig 2025-12-04T10:32:19.3975621Z * [new branch] gh/jansel/562/base -> origin/gh/jansel/562/base 2025-12-04T10:32:19.3975687Z * [new branch] gh/jansel/562/head -> origin/gh/jansel/562/head 2025-12-04T10:32:19.3975756Z * [new branch] gh/jansel/562/orig -> origin/gh/jansel/562/orig 2025-12-04T10:32:19.3975822Z * [new branch] gh/jansel/563/base -> origin/gh/jansel/563/base 2025-12-04T10:32:19.3975889Z * [new branch] gh/jansel/563/head -> origin/gh/jansel/563/head 2025-12-04T10:32:19.3975959Z * [new branch] gh/jansel/563/orig -> origin/gh/jansel/563/orig 2025-12-04T10:32:19.3976026Z * [new branch] gh/jansel/564/base -> origin/gh/jansel/564/base 2025-12-04T10:32:19.3976093Z * [new branch] gh/jansel/564/head -> origin/gh/jansel/564/head 2025-12-04T10:32:19.3976163Z * [new branch] gh/jansel/564/orig -> origin/gh/jansel/564/orig 2025-12-04T10:32:19.3976228Z * [new branch] gh/jansel/565/base -> origin/gh/jansel/565/base 2025-12-04T10:32:19.3976295Z * [new branch] gh/jansel/565/head -> origin/gh/jansel/565/head 2025-12-04T10:32:19.3976364Z * [new branch] gh/jansel/565/orig -> origin/gh/jansel/565/orig 2025-12-04T10:32:19.3976429Z * [new branch] gh/jansel/566/base -> origin/gh/jansel/566/base 2025-12-04T10:32:19.3976496Z * [new branch] gh/jansel/566/head -> origin/gh/jansel/566/head 2025-12-04T10:32:19.3976610Z * [new branch] 
gh/jansel/566/orig -> origin/gh/jansel/566/orig 2025-12-04T10:32:19.3976677Z * [new branch] gh/jansel/567/base -> origin/gh/jansel/567/base 2025-12-04T10:32:19.3976744Z * [new branch] gh/jansel/567/head -> origin/gh/jansel/567/head 2025-12-04T10:32:19.3976814Z * [new branch] gh/jansel/567/orig -> origin/gh/jansel/567/orig 2025-12-04T10:32:19.3976880Z * [new branch] gh/jansel/568/base -> origin/gh/jansel/568/base 2025-12-04T10:32:19.3976946Z * [new branch] gh/jansel/568/head -> origin/gh/jansel/568/head 2025-12-04T10:32:19.3977015Z * [new branch] gh/jansel/568/orig -> origin/gh/jansel/568/orig 2025-12-04T10:32:19.3977082Z * [new branch] gh/jansel/569/base -> origin/gh/jansel/569/base 2025-12-04T10:32:19.3977151Z * [new branch] gh/jansel/569/head -> origin/gh/jansel/569/head 2025-12-04T10:32:19.3977222Z * [new branch] gh/jansel/569/orig -> origin/gh/jansel/569/orig 2025-12-04T10:32:19.3977288Z * [new branch] gh/jansel/570/base -> origin/gh/jansel/570/base 2025-12-04T10:32:19.3977356Z * [new branch] gh/jansel/570/head -> origin/gh/jansel/570/head 2025-12-04T10:32:19.3977451Z * [new branch] gh/jansel/570/orig -> origin/gh/jansel/570/orig 2025-12-04T10:32:19.3977519Z * [new branch] gh/jansel/571/base -> origin/gh/jansel/571/base 2025-12-04T10:32:19.3977588Z * [new branch] gh/jansel/571/head -> origin/gh/jansel/571/head 2025-12-04T10:32:19.3977654Z * [new branch] gh/jansel/571/orig -> origin/gh/jansel/571/orig 2025-12-04T10:32:19.3977721Z * [new branch] gh/jansel/572/base -> origin/gh/jansel/572/base 2025-12-04T10:32:19.3977790Z * [new branch] gh/jansel/572/head -> origin/gh/jansel/572/head 2025-12-04T10:32:19.3977861Z * [new branch] gh/jansel/572/orig -> origin/gh/jansel/572/orig 2025-12-04T10:32:19.3977928Z * [new branch] gh/jansel/573/base -> origin/gh/jansel/573/base 2025-12-04T10:32:19.3977997Z * [new branch] gh/jansel/573/head -> origin/gh/jansel/573/head 2025-12-04T10:32:19.3978064Z * [new branch] gh/jansel/573/orig -> origin/gh/jansel/573/orig 2025-12-04T10:32:19.3978133Z * [new branch] gh/jansel/574/base -> origin/gh/jansel/574/base 2025-12-04T10:32:19.3978201Z * [new branch] gh/jansel/574/head -> origin/gh/jansel/574/head 2025-12-04T10:32:19.3978267Z * [new branch] gh/jansel/574/orig -> origin/gh/jansel/574/orig 2025-12-04T10:32:19.3978333Z * [new branch] gh/jansel/575/base -> origin/gh/jansel/575/base 2025-12-04T10:32:19.3978401Z * [new branch] gh/jansel/575/head -> origin/gh/jansel/575/head 2025-12-04T10:32:19.3978471Z * [new branch] gh/jansel/575/orig -> origin/gh/jansel/575/orig 2025-12-04T10:32:19.3978538Z * [new branch] gh/jansel/576/base -> origin/gh/jansel/576/base 2025-12-04T10:32:19.3978607Z * [new branch] gh/jansel/576/head -> origin/gh/jansel/576/head 2025-12-04T10:32:19.3978672Z * [new branch] gh/jansel/576/orig -> origin/gh/jansel/576/orig 2025-12-04T10:32:19.3978755Z * [new branch] gh/jbschlosser/247/base -> origin/gh/jbschlosser/247/base 2025-12-04T10:32:19.3978837Z * [new branch] gh/jbschlosser/247/head -> origin/gh/jbschlosser/247/head 2025-12-04T10:32:19.3978914Z * [new branch] gh/jbschlosser/247/orig -> origin/gh/jbschlosser/247/orig 2025-12-04T10:32:19.3978991Z * [new branch] gh/jbschlosser/250/base -> origin/gh/jbschlosser/250/base 2025-12-04T10:32:19.3979066Z * [new branch] gh/jbschlosser/250/head -> origin/gh/jbschlosser/250/head 2025-12-04T10:32:19.3979370Z * [new branch] gh/jbschlosser/250/orig -> origin/gh/jbschlosser/250/orig 2025-12-04T10:32:19.3979444Z * [new branch] gh/jerryzh168/1/base -> origin/gh/jerryzh168/1/base 2025-12-04T10:32:19.3979515Z * [new 
branch] gh/jerryzh168/1/head -> origin/gh/jerryzh168/1/head 2025-12-04T10:32:19.3979703Z * [new branch] gh/jerryzh168/1/orig -> origin/gh/jerryzh168/1/orig 2025-12-04T10:32:19.3979782Z * [new branch] gh/jiayisunx/59/base -> origin/gh/jiayisunx/59/base 2025-12-04T10:32:19.3979853Z * [new branch] gh/jiayisunx/59/head -> origin/gh/jiayisunx/59/head 2025-12-04T10:32:19.3979925Z * [new branch] gh/jiayisunx/59/orig -> origin/gh/jiayisunx/59/orig 2025-12-04T10:32:19.3979999Z * [new branch] gh/jiayisunx/61/base -> origin/gh/jiayisunx/61/base 2025-12-04T10:32:19.3980070Z * [new branch] gh/jiayisunx/61/head -> origin/gh/jiayisunx/61/head 2025-12-04T10:32:19.3980146Z * [new branch] gh/jiayisunx/61/orig -> origin/gh/jiayisunx/61/orig 2025-12-04T10:32:19.3980218Z * [new branch] gh/jiayisunx/68/base -> origin/gh/jiayisunx/68/base 2025-12-04T10:32:19.3980287Z * [new branch] gh/jiayisunx/68/head -> origin/gh/jiayisunx/68/head 2025-12-04T10:32:19.3980403Z * [new branch] gh/jiayisunx/68/orig -> origin/gh/jiayisunx/68/orig 2025-12-04T10:32:19.3980475Z * [new branch] gh/jiayisunx/77/base -> origin/gh/jiayisunx/77/base 2025-12-04T10:32:19.3980546Z * [new branch] gh/jiayisunx/77/head -> origin/gh/jiayisunx/77/head 2025-12-04T10:32:19.3980619Z * [new branch] gh/jiayisunx/77/orig -> origin/gh/jiayisunx/77/orig 2025-12-04T10:32:19.3980689Z * [new branch] gh/jiayisunx/78/base -> origin/gh/jiayisunx/78/base 2025-12-04T10:32:19.3980758Z * [new branch] gh/jiayisunx/78/head -> origin/gh/jiayisunx/78/head 2025-12-04T10:32:19.3980835Z * [new branch] gh/jiayisunx/78/orig -> origin/gh/jiayisunx/78/orig 2025-12-04T10:32:19.3980906Z * [new branch] gh/jiayisunx/79/base -> origin/gh/jiayisunx/79/base 2025-12-04T10:32:19.3980977Z * [new branch] gh/jiayisunx/79/head -> origin/gh/jiayisunx/79/head 2025-12-04T10:32:19.3981050Z * [new branch] gh/jiayisunx/79/orig -> origin/gh/jiayisunx/79/orig 2025-12-04T10:32:19.3981121Z * [new branch] gh/jiayisunx/82/base -> origin/gh/jiayisunx/82/base 2025-12-04T10:32:19.3981190Z * [new branch] gh/jiayisunx/82/head -> origin/gh/jiayisunx/82/head 2025-12-04T10:32:19.3981260Z * [new branch] gh/jiayisunx/82/orig -> origin/gh/jiayisunx/82/orig 2025-12-04T10:32:19.3981331Z * [new branch] gh/jiayisunx/83/base -> origin/gh/jiayisunx/83/base 2025-12-04T10:32:19.3981400Z * [new branch] gh/jiayisunx/83/head -> origin/gh/jiayisunx/83/head 2025-12-04T10:32:19.3981475Z * [new branch] gh/jiayisunx/83/orig -> origin/gh/jiayisunx/83/orig 2025-12-04T10:32:19.3981547Z * [new branch] gh/jiayisunx/84/base -> origin/gh/jiayisunx/84/base 2025-12-04T10:32:19.3981619Z * [new branch] gh/jiayisunx/84/head -> origin/gh/jiayisunx/84/head 2025-12-04T10:32:19.3981692Z * [new branch] gh/jiayisunx/84/orig -> origin/gh/jiayisunx/84/orig 2025-12-04T10:32:19.3981763Z * [new branch] gh/jiayisunx/85/base -> origin/gh/jiayisunx/85/base 2025-12-04T10:32:19.3981833Z * [new branch] gh/jiayisunx/85/head -> origin/gh/jiayisunx/85/head 2025-12-04T10:32:19.3981905Z * [new branch] gh/jiayisunx/85/orig -> origin/gh/jiayisunx/85/orig 2025-12-04T10:32:19.3981976Z * [new branch] gh/jiayisunx/86/base -> origin/gh/jiayisunx/86/base 2025-12-04T10:32:19.3982050Z * [new branch] gh/jiayisunx/86/head -> origin/gh/jiayisunx/86/head 2025-12-04T10:32:19.3982170Z * [new branch] gh/jiayisunx/86/orig -> origin/gh/jiayisunx/86/orig 2025-12-04T10:32:19.3982241Z * [new branch] gh/jiayisunx/87/base -> origin/gh/jiayisunx/87/base 2025-12-04T10:32:19.3982313Z * [new branch] gh/jiayisunx/87/head -> origin/gh/jiayisunx/87/head 2025-12-04T10:32:19.3982386Z * [new 
branch] gh/jiayisunx/87/orig -> origin/gh/jiayisunx/87/orig 2025-12-04T10:32:19.3982457Z * [new branch] gh/jiayisunx/88/base -> origin/gh/jiayisunx/88/base 2025-12-04T10:32:19.3982528Z * [new branch] gh/jiayisunx/88/head -> origin/gh/jiayisunx/88/head 2025-12-04T10:32:19.3982598Z * [new branch] gh/jiayisunx/88/orig -> origin/gh/jiayisunx/88/orig 2025-12-04T10:32:19.3982669Z * [new branch] gh/jiayisunx/89/base -> origin/gh/jiayisunx/89/base 2025-12-04T10:32:19.3982741Z * [new branch] gh/jiayisunx/89/head -> origin/gh/jiayisunx/89/head 2025-12-04T10:32:19.3982814Z * [new branch] gh/jiayisunx/89/orig -> origin/gh/jiayisunx/89/orig 2025-12-04T10:32:19.3982885Z * [new branch] gh/jiayisunx/90/base -> origin/gh/jiayisunx/90/base 2025-12-04T10:32:19.3982958Z * [new branch] gh/jiayisunx/90/head -> origin/gh/jiayisunx/90/head 2025-12-04T10:32:19.3983060Z * [new branch] gh/jiayisunx/90/orig -> origin/gh/jiayisunx/90/orig 2025-12-04T10:32:19.3983137Z * [new branch] gh/jjwu@meta.com/1/base -> origin/gh/jjwu@meta.com/1/base 2025-12-04T10:32:19.3983214Z * [new branch] gh/jjwu@meta.com/1/head -> origin/gh/jjwu@meta.com/1/head 2025-12-04T10:32:19.3983283Z * [new branch] gh/jturney/1/base -> origin/gh/jturney/1/base 2025-12-04T10:32:19.3983352Z * [new branch] gh/jturney/1/head -> origin/gh/jturney/1/head 2025-12-04T10:32:19.3983422Z * [new branch] gh/jturney/1/orig -> origin/gh/jturney/1/orig 2025-12-04T10:32:19.3983487Z * [new branch] gh/jturney/2/base -> origin/gh/jturney/2/base 2025-12-04T10:32:19.3983554Z * [new branch] gh/jturney/2/head -> origin/gh/jturney/2/head 2025-12-04T10:32:19.3983621Z * [new branch] gh/jturney/2/orig -> origin/gh/jturney/2/orig 2025-12-04T10:32:19.3983698Z * [new branch] gh/karthickai/10/base -> origin/gh/karthickai/10/base 2025-12-04T10:32:19.3983774Z * [new branch] gh/karthickai/10/head -> origin/gh/karthickai/10/head 2025-12-04T10:32:19.3983847Z * [new branch] gh/karthickai/10/orig -> origin/gh/karthickai/10/orig 2025-12-04T10:32:19.3983920Z * [new branch] gh/karthickai/11/base -> origin/gh/karthickai/11/base 2025-12-04T10:32:19.3983996Z * [new branch] gh/karthickai/11/head -> origin/gh/karthickai/11/head 2025-12-04T10:32:19.3984070Z * [new branch] gh/karthickai/11/orig -> origin/gh/karthickai/11/orig 2025-12-04T10:32:19.3984141Z * [new branch] gh/karthickai/12/base -> origin/gh/karthickai/12/base 2025-12-04T10:32:19.3984216Z * [new branch] gh/karthickai/12/head -> origin/gh/karthickai/12/head 2025-12-04T10:32:19.3984289Z * [new branch] gh/karthickai/12/orig -> origin/gh/karthickai/12/orig 2025-12-04T10:32:19.3984363Z * [new branch] gh/karthickai/13/base -> origin/gh/karthickai/13/base 2025-12-04T10:32:19.3984438Z * [new branch] gh/karthickai/13/head -> origin/gh/karthickai/13/head 2025-12-04T10:32:19.3984509Z * [new branch] gh/karthickai/13/orig -> origin/gh/karthickai/13/orig 2025-12-04T10:32:19.3984581Z * [new branch] gh/karthickai/14/base -> origin/gh/karthickai/14/base 2025-12-04T10:32:19.3984655Z * [new branch] gh/karthickai/14/head -> origin/gh/karthickai/14/head 2025-12-04T10:32:19.3984758Z * [new branch] gh/karthickai/14/orig -> origin/gh/karthickai/14/orig 2025-12-04T10:32:19.3984831Z * [new branch] gh/karthickai/15/base -> origin/gh/karthickai/15/base 2025-12-04T10:32:19.3984906Z * [new branch] gh/karthickai/15/head -> origin/gh/karthickai/15/head 2025-12-04T10:32:19.3984980Z * [new branch] gh/karthickai/15/orig -> origin/gh/karthickai/15/orig 2025-12-04T10:32:19.3985054Z * [new branch] gh/karthickai/16/base -> origin/gh/karthickai/16/base 
2025-12-04T10:32:19.3985128Z * [new branch] gh/karthickai/16/head -> origin/gh/karthickai/16/head 2025-12-04T10:32:19.3985201Z * [new branch] gh/karthickai/16/orig -> origin/gh/karthickai/16/orig 2025-12-04T10:32:19.3985276Z * [new branch] gh/karthickai/17/base -> origin/gh/karthickai/17/base 2025-12-04T10:32:19.3985349Z * [new branch] gh/karthickai/17/head -> origin/gh/karthickai/17/head 2025-12-04T10:32:19.3985423Z * [new branch] gh/karthickai/17/orig -> origin/gh/karthickai/17/orig 2025-12-04T10:32:19.3985497Z * [new branch] gh/karthickai/18/base -> origin/gh/karthickai/18/base 2025-12-04T10:32:19.3985571Z * [new branch] gh/karthickai/18/head -> origin/gh/karthickai/18/head 2025-12-04T10:32:19.3985683Z * [new branch] gh/karthickai/18/orig -> origin/gh/karthickai/18/orig 2025-12-04T10:32:19.3985757Z * [new branch] gh/karthickai/19/base -> origin/gh/karthickai/19/base 2025-12-04T10:32:19.3985829Z * [new branch] gh/karthickai/19/head -> origin/gh/karthickai/19/head 2025-12-04T10:32:19.3985900Z * [new branch] gh/karthickai/19/orig -> origin/gh/karthickai/19/orig 2025-12-04T10:32:19.3985976Z * [new branch] gh/karthickai/20/base -> origin/gh/karthickai/20/base 2025-12-04T10:32:19.3986049Z * [new branch] gh/karthickai/20/head -> origin/gh/karthickai/20/head 2025-12-04T10:32:19.3986122Z * [new branch] gh/karthickai/20/orig -> origin/gh/karthickai/20/orig 2025-12-04T10:32:19.3986196Z * [new branch] gh/karthickai/21/base -> origin/gh/karthickai/21/base 2025-12-04T10:32:19.3986269Z * [new branch] gh/karthickai/21/head -> origin/gh/karthickai/21/head 2025-12-04T10:32:19.3986342Z * [new branch] gh/karthickai/21/orig -> origin/gh/karthickai/21/orig 2025-12-04T10:32:19.3986416Z * [new branch] gh/karthickai/22/base -> origin/gh/karthickai/22/base 2025-12-04T10:32:19.3986488Z * [new branch] gh/karthickai/22/head -> origin/gh/karthickai/22/head 2025-12-04T10:32:19.3986560Z * [new branch] gh/karthickai/22/orig -> origin/gh/karthickai/22/orig 2025-12-04T10:32:19.3986634Z * [new branch] gh/karthickai/23/base -> origin/gh/karthickai/23/base 2025-12-04T10:32:19.3986705Z * [new branch] gh/karthickai/23/head -> origin/gh/karthickai/23/head 2025-12-04T10:32:19.3986778Z * [new branch] gh/karthickai/23/orig -> origin/gh/karthickai/23/orig 2025-12-04T10:32:19.3986852Z * [new branch] gh/karthickai/24/base -> origin/gh/karthickai/24/base 2025-12-04T10:32:19.3986923Z * [new branch] gh/karthickai/24/head -> origin/gh/karthickai/24/head 2025-12-04T10:32:19.3986999Z * [new branch] gh/karthickai/24/orig -> origin/gh/karthickai/24/orig 2025-12-04T10:32:19.3987072Z * [new branch] gh/karthickai/25/base -> origin/gh/karthickai/25/base 2025-12-04T10:32:19.3987144Z * [new branch] gh/karthickai/25/head -> origin/gh/karthickai/25/head 2025-12-04T10:32:19.3987217Z * [new branch] gh/karthickai/25/orig -> origin/gh/karthickai/25/orig 2025-12-04T10:32:19.3987289Z * [new branch] gh/karthickai/26/base -> origin/gh/karthickai/26/base 2025-12-04T10:32:19.3987361Z * [new branch] gh/karthickai/26/head -> origin/gh/karthickai/26/head 2025-12-04T10:32:19.3987464Z * [new branch] gh/karthickai/26/orig -> origin/gh/karthickai/26/orig 2025-12-04T10:32:19.3987537Z * [new branch] gh/karthickai/6/base -> origin/gh/karthickai/6/base 2025-12-04T10:32:19.3987610Z * [new branch] gh/karthickai/6/head -> origin/gh/karthickai/6/head 2025-12-04T10:32:19.3987684Z * [new branch] gh/karthickai/6/orig -> origin/gh/karthickai/6/orig 2025-12-04T10:32:19.3987751Z * [new branch] gh/krocki/1/base -> origin/gh/krocki/1/base 2025-12-04T10:32:19.3987819Z * [new 
branch] gh/krocki/1/head -> origin/gh/krocki/1/head 2025-12-04T10:32:19.3987887Z * [new branch] gh/krocki/1/orig -> origin/gh/krocki/1/orig 2025-12-04T10:32:19.3987952Z * [new branch] gh/krocki/2/base -> origin/gh/krocki/2/base 2025-12-04T10:32:19.3988016Z * [new branch] gh/krocki/2/head -> origin/gh/krocki/2/head 2025-12-04T10:32:19.3988084Z * [new branch] gh/krocki/2/orig -> origin/gh/krocki/2/orig 2025-12-04T10:32:19.3988164Z * [new branch] gh/kurtamohler/60/base -> origin/gh/kurtamohler/60/base 2025-12-04T10:32:19.3988241Z * [new branch] gh/kurtamohler/60/head -> origin/gh/kurtamohler/60/head 2025-12-04T10:32:19.3988354Z * [new branch] gh/kurtamohler/60/orig -> origin/gh/kurtamohler/60/orig 2025-12-04T10:32:19.3988429Z * [new branch] gh/kurtamohler/61/base -> origin/gh/kurtamohler/61/base 2025-12-04T10:32:19.3988503Z * [new branch] gh/kurtamohler/61/head -> origin/gh/kurtamohler/61/head 2025-12-04T10:32:19.3988578Z * [new branch] gh/kurtamohler/61/orig -> origin/gh/kurtamohler/61/orig 2025-12-04T10:32:19.3988652Z * [new branch] gh/kurtamohler/62/base -> origin/gh/kurtamohler/62/base 2025-12-04T10:32:19.3988733Z * [new branch] gh/kurtamohler/62/head -> origin/gh/kurtamohler/62/head 2025-12-04T10:32:19.3988805Z * [new branch] gh/kurtamohler/62/orig -> origin/gh/kurtamohler/62/orig 2025-12-04T10:32:19.3988878Z * [new branch] gh/kurtamohler/63/base -> origin/gh/kurtamohler/63/base 2025-12-04T10:32:19.3988954Z * [new branch] gh/kurtamohler/63/head -> origin/gh/kurtamohler/63/head 2025-12-04T10:32:19.3989030Z * [new branch] gh/kurtamohler/63/orig -> origin/gh/kurtamohler/63/orig 2025-12-04T10:32:19.3989105Z * [new branch] gh/kurtamohler/64/base -> origin/gh/kurtamohler/64/base 2025-12-04T10:32:19.3989180Z * [new branch] gh/kurtamohler/64/head -> origin/gh/kurtamohler/64/head 2025-12-04T10:32:19.3989255Z * [new branch] gh/kurtamohler/64/orig -> origin/gh/kurtamohler/64/orig 2025-12-04T10:32:19.3989329Z * [new branch] gh/kurtamohler/65/base -> origin/gh/kurtamohler/65/base 2025-12-04T10:32:19.3989407Z * [new branch] gh/kurtamohler/65/head -> origin/gh/kurtamohler/65/head 2025-12-04T10:32:19.3989481Z * [new branch] gh/kurtamohler/65/orig -> origin/gh/kurtamohler/65/orig 2025-12-04T10:32:19.3989554Z * [new branch] gh/kurtamohler/66/base -> origin/gh/kurtamohler/66/base 2025-12-04T10:32:19.3989671Z * [new branch] gh/kurtamohler/66/head -> origin/gh/kurtamohler/66/head 2025-12-04T10:32:19.3989747Z * [new branch] gh/kurtamohler/66/orig -> origin/gh/kurtamohler/66/orig 2025-12-04T10:32:19.3989820Z * [new branch] gh/kurtamohler/67/base -> origin/gh/kurtamohler/67/base 2025-12-04T10:32:19.3989895Z * [new branch] gh/kurtamohler/67/head -> origin/gh/kurtamohler/67/head 2025-12-04T10:32:19.3989968Z * [new branch] gh/kurtamohler/67/orig -> origin/gh/kurtamohler/67/orig 2025-12-04T10:32:19.3990042Z * [new branch] gh/kwen2501/130/base -> origin/gh/kwen2501/130/base 2025-12-04T10:32:19.3990164Z * [new branch] gh/kwen2501/130/head -> origin/gh/kwen2501/130/head 2025-12-04T10:32:19.3990234Z * [new branch] gh/kwen2501/130/orig -> origin/gh/kwen2501/130/orig 2025-12-04T10:32:19.3990306Z * [new branch] gh/kwen2501/170/base -> origin/gh/kwen2501/170/base 2025-12-04T10:32:19.3990378Z * [new branch] gh/kwen2501/170/head -> origin/gh/kwen2501/170/head 2025-12-04T10:32:19.3990448Z * [new branch] gh/kwen2501/187/base -> origin/gh/kwen2501/187/base 2025-12-04T10:32:19.3990520Z * [new branch] gh/kwen2501/187/head -> origin/gh/kwen2501/187/head 2025-12-04T10:32:19.3990590Z * [new branch] gh/kwen2501/187/orig -> 
origin/gh/kwen2501/187/orig 2025-12-04T10:32:19.3990660Z * [new branch] gh/kwen2501/188/base -> origin/gh/kwen2501/188/base 2025-12-04T10:32:19.3990732Z * [new branch] gh/kwen2501/188/head -> origin/gh/kwen2501/188/head 2025-12-04T10:32:19.3990803Z * [new branch] gh/kwen2501/188/orig -> origin/gh/kwen2501/188/orig 2025-12-04T10:32:19.3990872Z * [new branch] gh/kwen2501/211/base -> origin/gh/kwen2501/211/base 2025-12-04T10:32:19.3990943Z * [new branch] gh/kwen2501/211/head -> origin/gh/kwen2501/211/head 2025-12-04T10:32:19.3991059Z * [new branch] gh/kwen2501/224/base -> origin/gh/kwen2501/224/base 2025-12-04T10:32:19.3991128Z * [new branch] gh/kwen2501/224/head -> origin/gh/kwen2501/224/head 2025-12-04T10:32:19.3991200Z * [new branch] gh/kwen2501/224/orig -> origin/gh/kwen2501/224/orig 2025-12-04T10:32:19.3991269Z * [new branch] gh/kwen2501/228/base -> origin/gh/kwen2501/228/base 2025-12-04T10:32:19.3991338Z * [new branch] gh/kwen2501/228/head -> origin/gh/kwen2501/228/head 2025-12-04T10:32:19.3991410Z * [new branch] gh/kwen2501/228/orig -> origin/gh/kwen2501/228/orig 2025-12-04T10:32:19.3991480Z * [new branch] gh/kwen2501/234/base -> origin/gh/kwen2501/234/base 2025-12-04T10:32:19.3991550Z * [new branch] gh/kwen2501/234/head -> origin/gh/kwen2501/234/head 2025-12-04T10:32:19.3991620Z * [new branch] gh/kwen2501/234/orig -> origin/gh/kwen2501/234/orig 2025-12-04T10:32:19.3991690Z * [new branch] gh/kwen2501/235/base -> origin/gh/kwen2501/235/base 2025-12-04T10:32:19.3991760Z * [new branch] gh/kwen2501/235/head -> origin/gh/kwen2501/235/head 2025-12-04T10:32:19.3991828Z * [new branch] gh/kwen2501/235/orig -> origin/gh/kwen2501/235/orig 2025-12-04T10:32:19.3991897Z * [new branch] gh/kwen2501/236/base -> origin/gh/kwen2501/236/base 2025-12-04T10:32:19.3991968Z * [new branch] gh/kwen2501/236/head -> origin/gh/kwen2501/236/head 2025-12-04T10:32:19.3992036Z * [new branch] gh/kwen2501/236/orig -> origin/gh/kwen2501/236/orig 2025-12-04T10:32:19.3992107Z * [new branch] gh/kwen2501/237/base -> origin/gh/kwen2501/237/base 2025-12-04T10:32:19.3992179Z * [new branch] gh/kwen2501/237/head -> origin/gh/kwen2501/237/head 2025-12-04T10:32:19.3992247Z * [new branch] gh/kwen2501/237/orig -> origin/gh/kwen2501/237/orig 2025-12-04T10:32:19.3992317Z * [new branch] gh/kwen2501/238/base -> origin/gh/kwen2501/238/base 2025-12-04T10:32:19.3992387Z * [new branch] gh/kwen2501/238/head -> origin/gh/kwen2501/238/head 2025-12-04T10:32:19.3992455Z * [new branch] gh/kwen2501/238/orig -> origin/gh/kwen2501/238/orig 2025-12-04T10:32:19.3992524Z * [new branch] gh/kwen2501/240/base -> origin/gh/kwen2501/240/base 2025-12-04T10:32:19.3992594Z * [new branch] gh/kwen2501/240/head -> origin/gh/kwen2501/240/head 2025-12-04T10:32:19.3992663Z * [new branch] gh/kwen2501/240/orig -> origin/gh/kwen2501/240/orig 2025-12-04T10:32:19.3992757Z * [new branch] gh/kwen2501/241/base -> origin/gh/kwen2501/241/base 2025-12-04T10:32:19.3992828Z * [new branch] gh/kwen2501/241/head -> origin/gh/kwen2501/241/head 2025-12-04T10:32:19.3992897Z * [new branch] gh/kwen2501/241/orig -> origin/gh/kwen2501/241/orig 2025-12-04T10:32:19.3992968Z * [new branch] gh/kwen2501/247/base -> origin/gh/kwen2501/247/base 2025-12-04T10:32:19.3993039Z * [new branch] gh/kwen2501/247/head -> origin/gh/kwen2501/247/head 2025-12-04T10:32:19.3993108Z * [new branch] gh/kwen2501/247/orig -> origin/gh/kwen2501/247/orig 2025-12-04T10:32:19.3993180Z * [new branch] gh/kwen2501/252/base -> origin/gh/kwen2501/252/base 2025-12-04T10:32:19.3993249Z * [new branch] gh/kwen2501/252/head -> 
origin/gh/kwen2501/252/head 2025-12-04T10:32:19.3993320Z * [new branch] gh/kwen2501/252/orig -> origin/gh/kwen2501/252/orig 2025-12-04T10:32:19.3993391Z * [new branch] gh/kwen2501/259/base -> origin/gh/kwen2501/259/base 2025-12-04T10:32:19.3993460Z * [new branch] gh/kwen2501/259/head -> origin/gh/kwen2501/259/head 2025-12-04T10:32:19.3993529Z * [new branch] gh/kwen2501/259/orig -> origin/gh/kwen2501/259/orig 2025-12-04T10:32:19.3993628Z * [new branch] gh/kwen2501/260/base -> origin/gh/kwen2501/260/base 2025-12-04T10:32:19.3993696Z * [new branch] gh/kwen2501/260/head -> origin/gh/kwen2501/260/head 2025-12-04T10:32:19.3993765Z * [new branch] gh/kwen2501/260/orig -> origin/gh/kwen2501/260/orig 2025-12-04T10:32:19.3993836Z * [new branch] gh/kwen2501/268/base -> origin/gh/kwen2501/268/base 2025-12-04T10:32:19.3993905Z * [new branch] gh/kwen2501/268/head -> origin/gh/kwen2501/268/head 2025-12-04T10:32:19.3993974Z * [new branch] gh/kwen2501/268/orig -> origin/gh/kwen2501/268/orig 2025-12-04T10:32:19.3994044Z * [new branch] gh/kwen2501/269/base -> origin/gh/kwen2501/269/base 2025-12-04T10:32:19.3994113Z * [new branch] gh/kwen2501/269/head -> origin/gh/kwen2501/269/head 2025-12-04T10:32:19.3994182Z * [new branch] gh/kwen2501/269/orig -> origin/gh/kwen2501/269/orig 2025-12-04T10:32:19.3994253Z * [new branch] gh/kwen2501/270/base -> origin/gh/kwen2501/270/base 2025-12-04T10:32:19.3994322Z * [new branch] gh/kwen2501/270/head -> origin/gh/kwen2501/270/head 2025-12-04T10:32:19.3994391Z * [new branch] gh/kwen2501/270/orig -> origin/gh/kwen2501/270/orig 2025-12-04T10:32:19.3994464Z * [new branch] gh/kwen2501/271/base -> origin/gh/kwen2501/271/base 2025-12-04T10:32:19.3994534Z * [new branch] gh/kwen2501/271/head -> origin/gh/kwen2501/271/head 2025-12-04T10:32:19.3994605Z * [new branch] gh/kwen2501/271/orig -> origin/gh/kwen2501/271/orig 2025-12-04T10:32:19.3994675Z * [new branch] gh/kwen2501/274/base -> origin/gh/kwen2501/274/base 2025-12-04T10:32:19.3994744Z * [new branch] gh/kwen2501/274/head -> origin/gh/kwen2501/274/head 2025-12-04T10:32:19.3994815Z * [new branch] gh/kwen2501/274/orig -> origin/gh/kwen2501/274/orig 2025-12-04T10:32:19.3994886Z * [new branch] gh/kwen2501/275/base -> origin/gh/kwen2501/275/base 2025-12-04T10:32:19.3994954Z * [new branch] gh/kwen2501/275/head -> origin/gh/kwen2501/275/head 2025-12-04T10:32:19.3995024Z * [new branch] gh/kwen2501/275/orig -> origin/gh/kwen2501/275/orig 2025-12-04T10:32:19.3995093Z * [new branch] gh/kwen2501/276/base -> origin/gh/kwen2501/276/base 2025-12-04T10:32:19.3995162Z * [new branch] gh/kwen2501/276/head -> origin/gh/kwen2501/276/head 2025-12-04T10:32:19.3995262Z * [new branch] gh/kwen2501/276/orig -> origin/gh/kwen2501/276/orig 2025-12-04T10:32:19.3995331Z * [new branch] gh/kwen2501/277/base -> origin/gh/kwen2501/277/base 2025-12-04T10:32:19.3995400Z * [new branch] gh/kwen2501/277/head -> origin/gh/kwen2501/277/head 2025-12-04T10:32:19.3995471Z * [new branch] gh/kwen2501/277/orig -> origin/gh/kwen2501/277/orig 2025-12-04T10:32:19.3995542Z * [new branch] gh/kwen2501/278/base -> origin/gh/kwen2501/278/base 2025-12-04T10:32:19.3995611Z * [new branch] gh/kwen2501/278/head -> origin/gh/kwen2501/278/head 2025-12-04T10:32:19.3995682Z * [new branch] gh/kwen2501/278/orig -> origin/gh/kwen2501/278/orig 2025-12-04T10:32:19.3995751Z * [new branch] gh/kwen2501/279/base -> origin/gh/kwen2501/279/base 2025-12-04T10:32:19.3995819Z * [new branch] gh/kwen2501/279/head -> origin/gh/kwen2501/279/head 2025-12-04T10:32:19.3995892Z * [new branch] gh/kwen2501/279/orig -> 
origin/gh/kwen2501/279/orig 2025-12-04T10:32:19.3995962Z * [new branch] gh/kwen2501/280/base -> origin/gh/kwen2501/280/base 2025-12-04T10:32:19.3996031Z * [new branch] gh/kwen2501/280/head -> origin/gh/kwen2501/280/head 2025-12-04T10:32:19.3996137Z * [new branch] gh/kwen2501/280/orig -> origin/gh/kwen2501/280/orig 2025-12-04T10:32:19.3996205Z * [new branch] gh/kwen2501/281/base -> origin/gh/kwen2501/281/base 2025-12-04T10:32:19.3996275Z * [new branch] gh/kwen2501/281/head -> origin/gh/kwen2501/281/head 2025-12-04T10:32:19.3996343Z * [new branch] gh/kwen2501/281/orig -> origin/gh/kwen2501/281/orig 2025-12-04T10:32:19.3996412Z * [new branch] gh/kwen2501/282/base -> origin/gh/kwen2501/282/base 2025-12-04T10:32:19.3996483Z * [new branch] gh/kwen2501/282/head -> origin/gh/kwen2501/282/head 2025-12-04T10:32:19.3996553Z * [new branch] gh/kwen2501/282/orig -> origin/gh/kwen2501/282/orig 2025-12-04T10:32:19.3996622Z * [new branch] gh/kwen2501/283/base -> origin/gh/kwen2501/283/base 2025-12-04T10:32:19.3996694Z * [new branch] gh/kwen2501/283/head -> origin/gh/kwen2501/283/head 2025-12-04T10:32:19.3996765Z * [new branch] gh/kwen2501/283/orig -> origin/gh/kwen2501/283/orig 2025-12-04T10:32:19.3996834Z * [new branch] gh/kwen2501/284/base -> origin/gh/kwen2501/284/base 2025-12-04T10:32:19.3996905Z * [new branch] gh/kwen2501/284/head -> origin/gh/kwen2501/284/head 2025-12-04T10:32:19.3996975Z * [new branch] gh/kwen2501/284/orig -> origin/gh/kwen2501/284/orig 2025-12-04T10:32:19.3997044Z * [new branch] gh/kwen2501/285/base -> origin/gh/kwen2501/285/base 2025-12-04T10:32:19.3997114Z * [new branch] gh/kwen2501/285/head -> origin/gh/kwen2501/285/head 2025-12-04T10:32:19.3997184Z * [new branch] gh/kwen2501/285/orig -> origin/gh/kwen2501/285/orig 2025-12-04T10:32:19.3997254Z * [new branch] gh/kwen2501/286/base -> origin/gh/kwen2501/286/base 2025-12-04T10:32:19.3997329Z * [new branch] gh/kwen2501/286/head -> origin/gh/kwen2501/286/head 2025-12-04T10:32:19.3997400Z * [new branch] gh/kwen2501/286/orig -> origin/gh/kwen2501/286/orig 2025-12-04T10:32:19.3997469Z * [new branch] gh/kwen2501/287/base -> origin/gh/kwen2501/287/base 2025-12-04T10:32:19.3997542Z * [new branch] gh/kwen2501/287/head -> origin/gh/kwen2501/287/head 2025-12-04T10:32:19.3997611Z * [new branch] gh/kwen2501/287/orig -> origin/gh/kwen2501/287/orig 2025-12-04T10:32:19.3997683Z * [new branch] gh/kwen2501/288/base -> origin/gh/kwen2501/288/base 2025-12-04T10:32:19.3997752Z * [new branch] gh/kwen2501/288/head -> origin/gh/kwen2501/288/head 2025-12-04T10:32:19.3997848Z * [new branch] gh/kwen2501/288/orig -> origin/gh/kwen2501/288/orig 2025-12-04T10:32:19.3997927Z * [new branch] gh/laithsakka/251/base -> origin/gh/laithsakka/251/base 2025-12-04T10:32:19.3998003Z * [new branch] gh/laithsakka/251/head -> origin/gh/laithsakka/251/head 2025-12-04T10:32:19.3998079Z * [new branch] gh/laithsakka/251/orig -> origin/gh/laithsakka/251/orig 2025-12-04T10:32:19.3998155Z * [new branch] gh/laithsakka/276/base -> origin/gh/laithsakka/276/base 2025-12-04T10:32:19.3998227Z * [new branch] gh/laithsakka/276/head -> origin/gh/laithsakka/276/head 2025-12-04T10:32:19.3998300Z * [new branch] gh/laithsakka/276/orig -> origin/gh/laithsakka/276/orig 2025-12-04T10:32:19.3998375Z * [new branch] gh/laithsakka/28/base -> origin/gh/laithsakka/28/base 2025-12-04T10:32:19.3998448Z * [new branch] gh/laithsakka/29/base -> origin/gh/laithsakka/29/base 2025-12-04T10:32:19.3998524Z * [new branch] gh/laithsakka/30/base -> origin/gh/laithsakka/30/base 2025-12-04T10:32:19.3998600Z * [new 
branch] gh/laithsakka/30/head -> origin/gh/laithsakka/30/head 2025-12-04T10:32:19.3998671Z * [new branch] gh/laithsakka/31/base -> origin/gh/laithsakka/31/base 2025-12-04T10:32:19.3998772Z * [new branch] gh/laithsakka/31/head -> origin/gh/laithsakka/31/head 2025-12-04T10:32:19.3998849Z * [new branch] gh/laithsakka/313/base -> origin/gh/laithsakka/313/base 2025-12-04T10:32:19.3998925Z * [new branch] gh/laithsakka/313/head -> origin/gh/laithsakka/313/head 2025-12-04T10:32:19.3998999Z * [new branch] gh/laithsakka/313/orig -> origin/gh/laithsakka/313/orig 2025-12-04T10:32:19.3999074Z * [new branch] gh/laithsakka/316/base -> origin/gh/laithsakka/316/base 2025-12-04T10:32:19.3999149Z * [new branch] gh/laithsakka/316/head -> origin/gh/laithsakka/316/head 2025-12-04T10:32:19.3999223Z * [new branch] gh/laithsakka/316/orig -> origin/gh/laithsakka/316/orig 2025-12-04T10:32:19.3999296Z * [new branch] gh/laithsakka/317/base -> origin/gh/laithsakka/317/base 2025-12-04T10:32:19.3999370Z * [new branch] gh/laithsakka/317/head -> origin/gh/laithsakka/317/head 2025-12-04T10:32:19.3999447Z * [new branch] gh/laithsakka/317/orig -> origin/gh/laithsakka/317/orig 2025-12-04T10:32:19.3999520Z * [new branch] gh/laithsakka/319/base -> origin/gh/laithsakka/319/base 2025-12-04T10:32:19.3999623Z * [new branch] gh/laithsakka/319/head -> origin/gh/laithsakka/319/head 2025-12-04T10:32:19.3999698Z * [new branch] gh/laithsakka/319/orig -> origin/gh/laithsakka/319/orig 2025-12-04T10:32:19.3999771Z * [new branch] gh/laithsakka/32/base -> origin/gh/laithsakka/32/base 2025-12-04T10:32:19.3999847Z * [new branch] gh/laithsakka/32/head -> origin/gh/laithsakka/32/head 2025-12-04T10:32:19.3999922Z * [new branch] gh/laithsakka/320/base -> origin/gh/laithsakka/320/base 2025-12-04T10:32:19.3999994Z * [new branch] gh/laithsakka/320/head -> origin/gh/laithsakka/320/head 2025-12-04T10:32:19.4000069Z * [new branch] gh/laithsakka/320/orig -> origin/gh/laithsakka/320/orig 2025-12-04T10:32:19.4000144Z * [new branch] gh/laithsakka/321/base -> origin/gh/laithsakka/321/base 2025-12-04T10:32:19.4000217Z * [new branch] gh/laithsakka/321/head -> origin/gh/laithsakka/321/head 2025-12-04T10:32:19.4000290Z * [new branch] gh/laithsakka/321/orig -> origin/gh/laithsakka/321/orig 2025-12-04T10:32:19.4000366Z * [new branch] gh/laithsakka/322/base -> origin/gh/laithsakka/322/base 2025-12-04T10:32:19.4000440Z * [new branch] gh/laithsakka/322/head -> origin/gh/laithsakka/322/head 2025-12-04T10:32:19.4000555Z * [new branch] gh/laithsakka/322/orig -> origin/gh/laithsakka/322/orig 2025-12-04T10:32:19.4000631Z * [new branch] gh/laithsakka/323/base -> origin/gh/laithsakka/323/base 2025-12-04T10:32:19.4000704Z * [new branch] gh/laithsakka/323/head -> origin/gh/laithsakka/323/head 2025-12-04T10:32:19.4000780Z * [new branch] gh/laithsakka/323/orig -> origin/gh/laithsakka/323/orig 2025-12-04T10:32:19.4000854Z * [new branch] gh/laithsakka/324/base -> origin/gh/laithsakka/324/base 2025-12-04T10:32:19.4000927Z * [new branch] gh/laithsakka/324/head -> origin/gh/laithsakka/324/head 2025-12-04T10:32:19.4001002Z * [new branch] gh/laithsakka/324/orig -> origin/gh/laithsakka/324/orig 2025-12-04T10:32:19.4001073Z * [new branch] gh/laithsakka/325/base -> origin/gh/laithsakka/325/base 2025-12-04T10:32:19.4001145Z * [new branch] gh/laithsakka/325/head -> origin/gh/laithsakka/325/head 2025-12-04T10:32:19.4001221Z * [new branch] gh/laithsakka/325/orig -> origin/gh/laithsakka/325/orig 2025-12-04T10:32:19.4001294Z * [new branch] gh/laithsakka/326/base -> origin/gh/laithsakka/326/base 
2025-12-04T10:32:19.4001367Z * [new branch] gh/laithsakka/326/head -> origin/gh/laithsakka/326/head 2025-12-04T10:32:19.4001485Z * [new branch] gh/laithsakka/326/orig -> origin/gh/laithsakka/326/orig 2025-12-04T10:32:19.4001559Z * [new branch] gh/laithsakka/327/base -> origin/gh/laithsakka/327/base 2025-12-04T10:32:19.4001630Z * [new branch] gh/laithsakka/327/head -> origin/gh/laithsakka/327/head 2025-12-04T10:32:19.4001705Z * [new branch] gh/laithsakka/327/orig -> origin/gh/laithsakka/327/orig 2025-12-04T10:32:19.4001779Z * [new branch] gh/laithsakka/328/base -> origin/gh/laithsakka/328/base 2025-12-04T10:32:19.4001857Z * [new branch] gh/laithsakka/328/head -> origin/gh/laithsakka/328/head 2025-12-04T10:32:19.4001933Z * [new branch] gh/laithsakka/328/orig -> origin/gh/laithsakka/328/orig 2025-12-04T10:32:19.4002003Z * [new branch] gh/liangel/4/base -> origin/gh/liangel/4/base 2025-12-04T10:32:19.4002072Z * [new branch] gh/liangel/4/head -> origin/gh/liangel/4/head 2025-12-04T10:32:19.4002141Z * [new branch] gh/liangel/4/orig -> origin/gh/liangel/4/orig 2025-12-04T10:32:19.4002217Z * [new branch] gh/lucaskabela/1/base -> origin/gh/lucaskabela/1/base 2025-12-04T10:32:19.4002291Z * [new branch] gh/lucaskabela/1/head -> origin/gh/lucaskabela/1/head 2025-12-04T10:32:19.4002358Z * [new branch] gh/lw/4/base -> origin/gh/lw/4/base 2025-12-04T10:32:19.4002421Z * [new branch] gh/lw/4/head -> origin/gh/lw/4/head 2025-12-04T10:32:19.4002487Z * [new branch] gh/lw/4/orig -> origin/gh/lw/4/orig 2025-12-04T10:32:19.4002548Z * [new branch] gh/lw/5/base -> origin/gh/lw/5/base 2025-12-04T10:32:19.4002609Z * [new branch] gh/lw/5/head -> origin/gh/lw/5/head 2025-12-04T10:32:19.4002669Z * [new branch] gh/lw/5/orig -> origin/gh/lw/5/orig 2025-12-04T10:32:19.4002731Z * [new branch] gh/lw/6/base -> origin/gh/lw/6/base 2025-12-04T10:32:19.4002791Z * [new branch] gh/lw/6/head -> origin/gh/lw/6/head 2025-12-04T10:32:19.4002856Z * [new branch] gh/lw/6/orig -> origin/gh/lw/6/orig 2025-12-04T10:32:19.4002923Z * [new branch] gh/malfet/14/base -> origin/gh/malfet/14/base 2025-12-04T10:32:19.4002995Z * [new branch] gh/malfet/417/base -> origin/gh/malfet/417/base 2025-12-04T10:32:19.4003066Z * [new branch] gh/malfet/417/head -> origin/gh/malfet/417/head 2025-12-04T10:32:19.4003164Z * [new branch] gh/malfet/417/orig -> origin/gh/malfet/417/orig 2025-12-04T10:32:19.4003233Z * [new branch] gh/malfet/506/base -> origin/gh/malfet/506/base 2025-12-04T10:32:19.4003302Z * [new branch] gh/malfet/506/head -> origin/gh/malfet/506/head 2025-12-04T10:32:19.4003371Z * [new branch] gh/malfet/506/orig -> origin/gh/malfet/506/orig 2025-12-04T10:32:19.4003438Z * [new branch] gh/malfet/517/base -> origin/gh/malfet/517/base 2025-12-04T10:32:19.4003505Z * [new branch] gh/malfet/517/head -> origin/gh/malfet/517/head 2025-12-04T10:32:19.4003572Z * [new branch] gh/malfet/528/base -> origin/gh/malfet/528/base 2025-12-04T10:32:19.4003638Z * [new branch] gh/malfet/528/head -> origin/gh/malfet/528/head 2025-12-04T10:32:19.4003706Z * [new branch] gh/malfet/528/orig -> origin/gh/malfet/528/orig 2025-12-04T10:32:19.4003774Z * [new branch] gh/malfet/537/base -> origin/gh/malfet/537/base 2025-12-04T10:32:19.4003841Z * [new branch] gh/malfet/537/head -> origin/gh/malfet/537/head 2025-12-04T10:32:19.4003911Z * [new branch] gh/malfet/537/orig -> origin/gh/malfet/537/orig 2025-12-04T10:32:19.4004009Z * [new branch] gh/malfet/546/base -> origin/gh/malfet/546/base 2025-12-04T10:32:19.4004077Z * [new branch] gh/malfet/546/head -> origin/gh/malfet/546/head 
2025-12-04T10:32:19.4004145Z * [new branch] gh/malfet/546/orig -> origin/gh/malfet/546/orig 2025-12-04T10:32:19.4004212Z * [new branch] gh/malfet/565/base -> origin/gh/malfet/565/base 2025-12-04T10:32:19.4004280Z * [new branch] gh/malfet/565/head -> origin/gh/malfet/565/head 2025-12-04T10:32:19.4004345Z * [new branch] gh/malfet/565/orig -> origin/gh/malfet/565/orig 2025-12-04T10:32:19.4004414Z * [new branch] gh/malfet/575/base -> origin/gh/malfet/575/base 2025-12-04T10:32:19.4004484Z * [new branch] gh/malfet/575/head -> origin/gh/malfet/575/head 2025-12-04T10:32:19.4004550Z * [new branch] gh/malfet/575/orig -> origin/gh/malfet/575/orig 2025-12-04T10:32:19.4004619Z * [new branch] gh/malfet/580/base -> origin/gh/malfet/580/base 2025-12-04T10:32:19.4004687Z * [new branch] gh/malfet/580/head -> origin/gh/malfet/580/head 2025-12-04T10:32:19.4004753Z * [new branch] gh/malfet/580/orig -> origin/gh/malfet/580/orig 2025-12-04T10:32:19.4004820Z * [new branch] gh/malfet/581/base -> origin/gh/malfet/581/base 2025-12-04T10:32:19.4004888Z * [new branch] gh/malfet/581/head -> origin/gh/malfet/581/head 2025-12-04T10:32:19.4004954Z * [new branch] gh/malfet/581/orig -> origin/gh/malfet/581/orig 2025-12-04T10:32:19.4005022Z * [new branch] gh/malfet/583/base -> origin/gh/malfet/583/base 2025-12-04T10:32:19.4005093Z * [new branch] gh/malfet/583/head -> origin/gh/malfet/583/head 2025-12-04T10:32:19.4005159Z * [new branch] gh/malfet/583/orig -> origin/gh/malfet/583/orig 2025-12-04T10:32:19.4005225Z * [new branch] gh/malfet/586/base -> origin/gh/malfet/586/base 2025-12-04T10:32:19.4005295Z * [new branch] gh/malfet/586/head -> origin/gh/malfet/586/head 2025-12-04T10:32:19.4005362Z * [new branch] gh/malfet/586/orig -> origin/gh/malfet/586/orig 2025-12-04T10:32:19.4005429Z * [new branch] gh/malfet/587/base -> origin/gh/malfet/587/base 2025-12-04T10:32:19.4005497Z * [new branch] gh/malfet/587/head -> origin/gh/malfet/587/head 2025-12-04T10:32:19.4005563Z * [new branch] gh/malfet/587/orig -> origin/gh/malfet/587/orig 2025-12-04T10:32:19.4005668Z * [new branch] gh/malfet/588/base -> origin/gh/malfet/588/base 2025-12-04T10:32:19.4005736Z * [new branch] gh/malfet/588/head -> origin/gh/malfet/588/head 2025-12-04T10:32:19.4005801Z * [new branch] gh/malfet/588/orig -> origin/gh/malfet/588/orig 2025-12-04T10:32:19.4005869Z * [new branch] gh/malfet/589/base -> origin/gh/malfet/589/base 2025-12-04T10:32:19.4005935Z * [new branch] gh/malfet/589/head -> origin/gh/malfet/589/head 2025-12-04T10:32:19.4006003Z * [new branch] gh/malfet/589/orig -> origin/gh/malfet/589/orig 2025-12-04T10:32:19.4006070Z * [new branch] gh/malfet/590/base -> origin/gh/malfet/590/base 2025-12-04T10:32:19.4006137Z * [new branch] gh/malfet/590/head -> origin/gh/malfet/590/head 2025-12-04T10:32:19.4006205Z * [new branch] gh/malfet/590/orig -> origin/gh/malfet/590/orig 2025-12-04T10:32:19.4006276Z * [new branch] gh/malfet/591/base -> origin/gh/malfet/591/base 2025-12-04T10:32:19.4006341Z * [new branch] gh/malfet/591/head -> origin/gh/malfet/591/head 2025-12-04T10:32:19.4006406Z * [new branch] gh/malfet/591/orig -> origin/gh/malfet/591/orig 2025-12-04T10:32:19.4006475Z * [new branch] gh/malfet/592/base -> origin/gh/malfet/592/base 2025-12-04T10:32:19.4006567Z * [new branch] gh/malfet/592/head -> origin/gh/malfet/592/head 2025-12-04T10:32:19.4006635Z * [new branch] gh/malfet/592/orig -> origin/gh/malfet/592/orig 2025-12-04T10:32:19.4006704Z * [new branch] gh/malfet/593/base -> origin/gh/malfet/593/base 2025-12-04T10:32:19.4006771Z * [new branch] 
gh/malfet/593/head -> origin/gh/malfet/593/head 2025-12-04T10:32:19.4006839Z * [new branch] gh/malfet/593/orig -> origin/gh/malfet/593/orig 2025-12-04T10:32:19.4006910Z * [new branch] gh/malfet/594/base -> origin/gh/malfet/594/base 2025-12-04T10:32:19.4006976Z * [new branch] gh/malfet/594/head -> origin/gh/malfet/594/head 2025-12-04T10:32:19.4007043Z * [new branch] gh/malfet/594/orig -> origin/gh/malfet/594/orig 2025-12-04T10:32:19.4007111Z * [new branch] gh/malfet/595/base -> origin/gh/malfet/595/base 2025-12-04T10:32:19.4007179Z * [new branch] gh/malfet/595/head -> origin/gh/malfet/595/head 2025-12-04T10:32:19.4007246Z * [new branch] gh/malfet/595/orig -> origin/gh/malfet/595/orig 2025-12-04T10:32:19.4007317Z * [new branch] gh/malfet/596/base -> origin/gh/malfet/596/base 2025-12-04T10:32:19.4007384Z * [new branch] gh/malfet/596/head -> origin/gh/malfet/596/head 2025-12-04T10:32:19.4007453Z * [new branch] gh/malfet/596/orig -> origin/gh/malfet/596/orig 2025-12-04T10:32:19.4007521Z * [new branch] gh/malfet/597/base -> origin/gh/malfet/597/base 2025-12-04T10:32:19.4007587Z * [new branch] gh/malfet/597/head -> origin/gh/malfet/597/head 2025-12-04T10:32:19.4007655Z * [new branch] gh/malfet/597/orig -> origin/gh/malfet/597/orig 2025-12-04T10:32:19.4007722Z * [new branch] gh/malfet/598/base -> origin/gh/malfet/598/base 2025-12-04T10:32:19.4007790Z * [new branch] gh/malfet/598/head -> origin/gh/malfet/598/head 2025-12-04T10:32:19.4007859Z * [new branch] gh/malfet/598/orig -> origin/gh/malfet/598/orig 2025-12-04T10:32:19.4007924Z * [new branch] gh/malfet/599/base -> origin/gh/malfet/599/base 2025-12-04T10:32:19.4007991Z * [new branch] gh/malfet/599/head -> origin/gh/malfet/599/head 2025-12-04T10:32:19.4008057Z * [new branch] gh/malfet/599/orig -> origin/gh/malfet/599/orig 2025-12-04T10:32:19.4008153Z * [new branch] gh/malfet/600/base -> origin/gh/malfet/600/base 2025-12-04T10:32:19.4008220Z * [new branch] gh/malfet/600/head -> origin/gh/malfet/600/head 2025-12-04T10:32:19.4008289Z * [new branch] gh/malfet/600/orig -> origin/gh/malfet/600/orig 2025-12-04T10:32:19.4008355Z * [new branch] gh/malfet/601/base -> origin/gh/malfet/601/base 2025-12-04T10:32:19.4008425Z * [new branch] gh/malfet/601/head -> origin/gh/malfet/601/head 2025-12-04T10:32:19.4008493Z * [new branch] gh/malfet/601/orig -> origin/gh/malfet/601/orig 2025-12-04T10:32:19.4008560Z * [new branch] gh/malfet/602/base -> origin/gh/malfet/602/base 2025-12-04T10:32:19.4008626Z * [new branch] gh/malfet/602/head -> origin/gh/malfet/602/head 2025-12-04T10:32:19.4008697Z * [new branch] gh/malfet/602/orig -> origin/gh/malfet/602/orig 2025-12-04T10:32:19.4008766Z * [new branch] gh/malfet/603/base -> origin/gh/malfet/603/base 2025-12-04T10:32:19.4008834Z * [new branch] gh/malfet/603/head -> origin/gh/malfet/603/head 2025-12-04T10:32:19.4008903Z * [new branch] gh/malfet/603/orig -> origin/gh/malfet/603/orig 2025-12-04T10:32:19.4008970Z * [new branch] gh/malfet/604/base -> origin/gh/malfet/604/base 2025-12-04T10:32:19.4009064Z * [new branch] gh/malfet/604/head -> origin/gh/malfet/604/head 2025-12-04T10:32:19.4009132Z * [new branch] gh/malfet/604/orig -> origin/gh/malfet/604/orig 2025-12-04T10:32:19.4009197Z * [new branch] gh/malfet/605/base -> origin/gh/malfet/605/base 2025-12-04T10:32:19.4009267Z * [new branch] gh/malfet/605/head -> origin/gh/malfet/605/head 2025-12-04T10:32:19.4009334Z * [new branch] gh/malfet/605/orig -> origin/gh/malfet/605/orig 2025-12-04T10:32:19.4009403Z * [new branch] gh/malfet/606/base -> origin/gh/malfet/606/base 
2025-12-04T10:32:19.4009476Z * [new branch] gh/malfet/606/head -> origin/gh/malfet/606/head 2025-12-04T10:32:19.4009544Z * [new branch] gh/malfet/606/orig -> origin/gh/malfet/606/orig 2025-12-04T10:32:19.4009646Z * [new branch] gh/malfet/607/base -> origin/gh/malfet/607/base 2025-12-04T10:32:19.4009715Z * [new branch] gh/malfet/607/head -> origin/gh/malfet/607/head 2025-12-04T10:32:19.4009782Z * [new branch] gh/malfet/607/orig -> origin/gh/malfet/607/orig 2025-12-04T10:32:19.4009848Z * [new branch] gh/malfet/608/base -> origin/gh/malfet/608/base 2025-12-04T10:32:19.4009916Z * [new branch] gh/malfet/608/head -> origin/gh/malfet/608/head 2025-12-04T10:32:19.4009983Z * [new branch] gh/malfet/608/orig -> origin/gh/malfet/608/orig 2025-12-04T10:32:19.4010050Z * [new branch] gh/malfet/609/base -> origin/gh/malfet/609/base 2025-12-04T10:32:19.4010123Z * [new branch] gh/malfet/609/head -> origin/gh/malfet/609/head 2025-12-04T10:32:19.4010190Z * [new branch] gh/malfet/609/orig -> origin/gh/malfet/609/orig 2025-12-04T10:32:19.4010256Z * [new branch] gh/malfet/610/base -> origin/gh/malfet/610/base 2025-12-04T10:32:19.4010326Z * [new branch] gh/malfet/610/head -> origin/gh/malfet/610/head 2025-12-04T10:32:19.4010393Z * [new branch] gh/malfet/610/orig -> origin/gh/malfet/610/orig 2025-12-04T10:32:19.4010459Z * [new branch] gh/malfet/611/base -> origin/gh/malfet/611/base 2025-12-04T10:32:19.4010528Z * [new branch] gh/malfet/611/head -> origin/gh/malfet/611/head 2025-12-04T10:32:19.4010595Z * [new branch] gh/malfet/611/orig -> origin/gh/malfet/611/orig 2025-12-04T10:32:19.4010662Z * [new branch] gh/malfet/612/base -> origin/gh/malfet/612/base 2025-12-04T10:32:19.4010781Z * [new branch] gh/malfet/612/head -> origin/gh/malfet/612/head 2025-12-04T10:32:19.4010848Z * [new branch] gh/malfet/612/orig -> origin/gh/malfet/612/orig 2025-12-04T10:32:19.4010916Z * [new branch] gh/malfet/64/base -> origin/gh/malfet/64/base 2025-12-04T10:32:19.4010985Z * [new branch] gh/malfet/64/head -> origin/gh/malfet/64/head 2025-12-04T10:32:19.4011075Z * [new branch] gh/manuelcandales/11/base -> origin/gh/manuelcandales/11/base 2025-12-04T10:32:19.4011164Z * [new branch] gh/manuelcandales/11/head -> origin/gh/manuelcandales/11/head 2025-12-04T10:32:19.4011249Z * [new branch] gh/manuelcandales/11/orig -> origin/gh/manuelcandales/11/orig 2025-12-04T10:32:19.4011319Z * [new branch] gh/markkm/1/base -> origin/gh/markkm/1/base 2025-12-04T10:32:19.4011395Z * [new branch] gh/masnesral/1/base -> origin/gh/masnesral/1/base 2025-12-04T10:32:19.4011468Z * [new branch] gh/masnesral/1/head -> origin/gh/masnesral/1/head 2025-12-04T10:32:19.4011539Z * [new branch] gh/masnesral/1/orig -> origin/gh/masnesral/1/orig 2025-12-04T10:32:19.4011610Z * [new branch] gh/mhorowitz/0/base -> origin/gh/mhorowitz/0/base 2025-12-04T10:32:19.4011724Z * [new branch] gh/mhorowitz/0/head -> origin/gh/mhorowitz/0/head 2025-12-04T10:32:19.4011794Z * [new branch] gh/mhorowitz/1/base -> origin/gh/mhorowitz/1/base 2025-12-04T10:32:19.4011867Z * [new branch] gh/mhorowitz/1/head -> origin/gh/mhorowitz/1/head 2025-12-04T10:32:19.4011937Z * [new branch] gh/mhorowitz/2/base -> origin/gh/mhorowitz/2/base 2025-12-04T10:32:19.4012006Z * [new branch] gh/mhorowitz/2/head -> origin/gh/mhorowitz/2/head 2025-12-04T10:32:19.4012080Z * [new branch] gh/mhorowitz/3/base -> origin/gh/mhorowitz/3/base 2025-12-04T10:32:19.4012150Z * [new branch] gh/mhorowitz/3/head -> origin/gh/mhorowitz/3/head 2025-12-04T10:32:19.4012219Z * [new branch] gh/mhorowitz/4/base -> origin/gh/mhorowitz/4/base 
2025-12-04T10:32:19.4012293Z * [new branch] gh/mhorowitz/4/head -> origin/gh/mhorowitz/4/head 2025-12-04T10:32:19.4012362Z * [new branch] gh/mhorowitz/5/base -> origin/gh/mhorowitz/5/base 2025-12-04T10:32:19.4012433Z * [new branch] gh/mhorowitz/5/head -> origin/gh/mhorowitz/5/head 2025-12-04T10:32:19.4012505Z * [new branch] gh/mhorowitz/6/base -> origin/gh/mhorowitz/6/base 2025-12-04T10:32:19.4012575Z * [new branch] gh/mhorowitz/6/head -> origin/gh/mhorowitz/6/head 2025-12-04T10:32:19.4012677Z * [new branch] gh/mikaylagawarecki/234/base -> origin/gh/mikaylagawarecki/234/base 2025-12-04T10:32:19.4012774Z * [new branch] gh/mikaylagawarecki/234/head -> origin/gh/mikaylagawarecki/234/head 2025-12-04T10:32:19.4012867Z * [new branch] gh/mikaylagawarecki/235/base -> origin/gh/mikaylagawarecki/235/base 2025-12-04T10:32:19.4012962Z * [new branch] gh/mikaylagawarecki/235/head -> origin/gh/mikaylagawarecki/235/head 2025-12-04T10:32:19.4013055Z * [new branch] gh/mikaylagawarecki/236/base -> origin/gh/mikaylagawarecki/236/base 2025-12-04T10:32:19.4013147Z * [new branch] gh/mikaylagawarecki/236/head -> origin/gh/mikaylagawarecki/236/head 2025-12-04T10:32:19.4013240Z * [new branch] gh/mikaylagawarecki/237/base -> origin/gh/mikaylagawarecki/237/base 2025-12-04T10:32:19.4013332Z * [new branch] gh/mikaylagawarecki/237/head -> origin/gh/mikaylagawarecki/237/head 2025-12-04T10:32:19.4013422Z * [new branch] gh/mikaylagawarecki/238/base -> origin/gh/mikaylagawarecki/238/base 2025-12-04T10:32:19.4013545Z * [new branch] gh/mikaylagawarecki/238/head -> origin/gh/mikaylagawarecki/238/head 2025-12-04T10:32:19.4013635Z * [new branch] gh/mikaylagawarecki/336/base -> origin/gh/mikaylagawarecki/336/base 2025-12-04T10:32:19.4013726Z * [new branch] gh/mikaylagawarecki/336/head -> origin/gh/mikaylagawarecki/336/head 2025-12-04T10:32:19.4013821Z * [new branch] gh/mikaylagawarecki/336/orig -> origin/gh/mikaylagawarecki/336/orig 2025-12-04T10:32:19.4013912Z * [new branch] gh/mikaylagawarecki/341/base -> origin/gh/mikaylagawarecki/341/base 2025-12-04T10:32:19.4014003Z * [new branch] gh/mikaylagawarecki/341/head -> origin/gh/mikaylagawarecki/341/head 2025-12-04T10:32:19.4014095Z * [new branch] gh/mikaylagawarecki/341/orig -> origin/gh/mikaylagawarecki/341/orig 2025-12-04T10:32:19.4014187Z * [new branch] gh/mikaylagawarecki/342/base -> origin/gh/mikaylagawarecki/342/base 2025-12-04T10:32:19.4014280Z * [new branch] gh/mikaylagawarecki/342/head -> origin/gh/mikaylagawarecki/342/head 2025-12-04T10:32:19.4014371Z * [new branch] gh/mikaylagawarecki/342/orig -> origin/gh/mikaylagawarecki/342/orig 2025-12-04T10:32:19.4014461Z * [new branch] gh/mikaylagawarecki/345/base -> origin/gh/mikaylagawarecki/345/base 2025-12-04T10:32:19.4014581Z * [new branch] gh/mikaylagawarecki/345/head -> origin/gh/mikaylagawarecki/345/head 2025-12-04T10:32:19.4014672Z * [new branch] gh/mikaylagawarecki/345/orig -> origin/gh/mikaylagawarecki/345/orig 2025-12-04T10:32:19.4014762Z * [new branch] gh/mikaylagawarecki/346/base -> origin/gh/mikaylagawarecki/346/base 2025-12-04T10:32:19.4014856Z * [new branch] gh/mikaylagawarecki/346/head -> origin/gh/mikaylagawarecki/346/head 2025-12-04T10:32:19.4014950Z * [new branch] gh/mikaylagawarecki/346/orig -> origin/gh/mikaylagawarecki/346/orig 2025-12-04T10:32:19.4015043Z * [new branch] gh/mikaylagawarecki/347/base -> origin/gh/mikaylagawarecki/347/base 2025-12-04T10:32:19.4015137Z * [new branch] gh/mikaylagawarecki/347/head -> origin/gh/mikaylagawarecki/347/head 2025-12-04T10:32:19.4015228Z * [new branch] 
gh/mikaylagawarecki/347/orig -> origin/gh/mikaylagawarecki/347/orig 2025-12-04T10:32:19.4015320Z * [new branch] gh/mikaylagawarecki/350/base -> origin/gh/mikaylagawarecki/350/base 2025-12-04T10:32:19.4015414Z * [new branch] gh/mikaylagawarecki/350/head -> origin/gh/mikaylagawarecki/350/head 2025-12-04T10:32:19.4015505Z * [new branch] gh/mikaylagawarecki/350/orig -> origin/gh/mikaylagawarecki/350/orig 2025-12-04T10:32:19.4015597Z * [new branch] gh/mikaylagawarecki/351/base -> origin/gh/mikaylagawarecki/351/base 2025-12-04T10:32:19.4015686Z * [new branch] gh/mikaylagawarecki/351/head -> origin/gh/mikaylagawarecki/351/head 2025-12-04T10:32:19.4015779Z * [new branch] gh/mikaylagawarecki/351/orig -> origin/gh/mikaylagawarecki/351/orig 2025-12-04T10:32:19.4015875Z * [new branch] gh/mikaylagawarecki/352/base -> origin/gh/mikaylagawarecki/352/base 2025-12-04T10:32:19.4015966Z * [new branch] gh/mikaylagawarecki/352/head -> origin/gh/mikaylagawarecki/352/head 2025-12-04T10:32:19.4016057Z * [new branch] gh/mikaylagawarecki/352/orig -> origin/gh/mikaylagawarecki/352/orig 2025-12-04T10:32:19.4016151Z * [new branch] gh/mikaylagawarecki/353/base -> origin/gh/mikaylagawarecki/353/base 2025-12-04T10:32:19.4016242Z * [new branch] gh/mikaylagawarecki/353/head -> origin/gh/mikaylagawarecki/353/head 2025-12-04T10:32:19.4016334Z * [new branch] gh/mikaylagawarecki/353/orig -> origin/gh/mikaylagawarecki/353/orig 2025-12-04T10:32:19.4016427Z * [new branch] gh/mikaylagawarecki/354/base -> origin/gh/mikaylagawarecki/354/base 2025-12-04T10:32:19.4016553Z * [new branch] gh/mikaylagawarecki/354/head -> origin/gh/mikaylagawarecki/354/head 2025-12-04T10:32:19.4016645Z * [new branch] gh/mikaylagawarecki/354/orig -> origin/gh/mikaylagawarecki/354/orig 2025-12-04T10:32:19.4016739Z * [new branch] gh/mikaylagawarecki/356/base -> origin/gh/mikaylagawarecki/356/base 2025-12-04T10:32:19.4016829Z * [new branch] gh/mikaylagawarecki/356/head -> origin/gh/mikaylagawarecki/356/head 2025-12-04T10:32:19.4016922Z * [new branch] gh/mikaylagawarecki/356/orig -> origin/gh/mikaylagawarecki/356/orig 2025-12-04T10:32:19.4017013Z * [new branch] gh/mikaylagawarecki/357/base -> origin/gh/mikaylagawarecki/357/base 2025-12-04T10:32:19.4017103Z * [new branch] gh/mikaylagawarecki/357/head -> origin/gh/mikaylagawarecki/357/head 2025-12-04T10:32:19.4017197Z * [new branch] gh/mikaylagawarecki/357/orig -> origin/gh/mikaylagawarecki/357/orig 2025-12-04T10:32:19.4017290Z * [new branch] gh/mikaylagawarecki/359/base -> origin/gh/mikaylagawarecki/359/base 2025-12-04T10:32:19.4017381Z * [new branch] gh/mikaylagawarecki/359/head -> origin/gh/mikaylagawarecki/359/head 2025-12-04T10:32:19.4017475Z * [new branch] gh/mikaylagawarecki/359/orig -> origin/gh/mikaylagawarecki/359/orig 2025-12-04T10:32:19.4017589Z * [new branch] gh/mikaylagawarecki/360/base -> origin/gh/mikaylagawarecki/360/base 2025-12-04T10:32:19.4017681Z * [new branch] gh/mikaylagawarecki/360/head -> origin/gh/mikaylagawarecki/360/head 2025-12-04T10:32:19.4017775Z * [new branch] gh/mikaylagawarecki/360/orig -> origin/gh/mikaylagawarecki/360/orig 2025-12-04T10:32:19.4017868Z * [new branch] gh/mikaylagawarecki/361/base -> origin/gh/mikaylagawarecki/361/base 2025-12-04T10:32:19.4017960Z * [new branch] gh/mikaylagawarecki/361/head -> origin/gh/mikaylagawarecki/361/head 2025-12-04T10:32:19.4018054Z * [new branch] gh/mikaylagawarecki/361/orig -> origin/gh/mikaylagawarecki/361/orig 2025-12-04T10:32:19.4018146Z * [new branch] gh/mikaylagawarecki/362/base -> origin/gh/mikaylagawarecki/362/base 
2025-12-04T10:32:19.4018238Z * [new branch] gh/mikaylagawarecki/362/head -> origin/gh/mikaylagawarecki/362/head 2025-12-04T10:32:19.4018332Z * [new branch] gh/mikaylagawarecki/362/orig -> origin/gh/mikaylagawarecki/362/orig 2025-12-04T10:32:19.4018423Z * [new branch] gh/mikaylagawarecki/363/base -> origin/gh/mikaylagawarecki/363/base 2025-12-04T10:32:19.4018515Z * [new branch] gh/mikaylagawarecki/363/head -> origin/gh/mikaylagawarecki/363/head 2025-12-04T10:32:19.4018606Z * [new branch] gh/mikaylagawarecki/363/orig -> origin/gh/mikaylagawarecki/363/orig 2025-12-04T10:32:19.4018698Z * [new branch] gh/mikaylagawarecki/364/base -> origin/gh/mikaylagawarecki/364/base 2025-12-04T10:32:19.4018793Z * [new branch] gh/mikaylagawarecki/364/head -> origin/gh/mikaylagawarecki/364/head 2025-12-04T10:32:19.4018885Z * [new branch] gh/mikaylagawarecki/364/orig -> origin/gh/mikaylagawarecki/364/orig 2025-12-04T10:32:19.4018977Z * [new branch] gh/mikaylagawarecki/365/base -> origin/gh/mikaylagawarecki/365/base 2025-12-04T10:32:19.4019073Z * [new branch] gh/mikaylagawarecki/365/head -> origin/gh/mikaylagawarecki/365/head 2025-12-04T10:32:19.4019165Z * [new branch] gh/mikaylagawarecki/365/orig -> origin/gh/mikaylagawarecki/365/orig 2025-12-04T10:32:19.4019256Z * [new branch] gh/mikaylagawarecki/366/base -> origin/gh/mikaylagawarecki/366/base 2025-12-04T10:32:19.4019348Z * [new branch] gh/mikaylagawarecki/366/head -> origin/gh/mikaylagawarecki/366/head 2025-12-04T10:32:19.4019439Z * [new branch] gh/mikaylagawarecki/366/orig -> origin/gh/mikaylagawarecki/366/orig 2025-12-04T10:32:19.4019558Z * [new branch] gh/mikaylagawarecki/367/base -> origin/gh/mikaylagawarecki/367/base 2025-12-04T10:32:19.4019691Z * [new branch] gh/mikaylagawarecki/367/head -> origin/gh/mikaylagawarecki/367/head 2025-12-04T10:32:19.4019783Z * [new branch] gh/mikaylagawarecki/367/orig -> origin/gh/mikaylagawarecki/367/orig 2025-12-04T10:32:19.4019878Z * [new branch] gh/mikaylagawarecki/368/base -> origin/gh/mikaylagawarecki/368/base 2025-12-04T10:32:19.4019968Z * [new branch] gh/mikaylagawarecki/368/head -> origin/gh/mikaylagawarecki/368/head 2025-12-04T10:32:19.4020060Z * [new branch] gh/mikaylagawarecki/368/orig -> origin/gh/mikaylagawarecki/368/orig 2025-12-04T10:32:19.4020154Z * [new branch] gh/mikaylagawarecki/369/base -> origin/gh/mikaylagawarecki/369/base 2025-12-04T10:32:19.4020246Z * [new branch] gh/mikaylagawarecki/369/head -> origin/gh/mikaylagawarecki/369/head 2025-12-04T10:32:19.4020339Z * [new branch] gh/mikaylagawarecki/369/orig -> origin/gh/mikaylagawarecki/369/orig 2025-12-04T10:32:19.4020430Z * [new branch] gh/mikaylagawarecki/370/base -> origin/gh/mikaylagawarecki/370/base 2025-12-04T10:32:19.4020522Z * [new branch] gh/mikaylagawarecki/370/head -> origin/gh/mikaylagawarecki/370/head 2025-12-04T10:32:19.4020664Z * [new branch] gh/mikaylagawarecki/370/orig -> origin/gh/mikaylagawarecki/370/orig 2025-12-04T10:32:19.4020758Z * [new branch] gh/mikaylagawarecki/371/base -> origin/gh/mikaylagawarecki/371/base 2025-12-04T10:32:19.4020849Z * [new branch] gh/mikaylagawarecki/371/head -> origin/gh/mikaylagawarecki/371/head 2025-12-04T10:32:19.4020941Z * [new branch] gh/mikaylagawarecki/371/orig -> origin/gh/mikaylagawarecki/371/orig 2025-12-04T10:32:19.4021033Z * [new branch] gh/mikaylagawarecki/372/base -> origin/gh/mikaylagawarecki/372/base 2025-12-04T10:32:19.4021124Z * [new branch] gh/mikaylagawarecki/372/head -> origin/gh/mikaylagawarecki/372/head 2025-12-04T10:32:19.4021215Z * [new branch] gh/mikaylagawarecki/372/orig -> 
origin/gh/mikaylagawarecki/372/orig 2025-12-04T10:32:19.4021309Z * [new branch] gh/mikaylagawarecki/373/base -> origin/gh/mikaylagawarecki/373/base 2025-12-04T10:32:19.4021400Z * [new branch] gh/mikaylagawarecki/373/head -> origin/gh/mikaylagawarecki/373/head 2025-12-04T10:32:19.4021491Z * [new branch] gh/mikaylagawarecki/373/orig -> origin/gh/mikaylagawarecki/373/orig 2025-12-04T10:32:19.4021583Z * [new branch] gh/mikaylagawarecki/374/base -> origin/gh/mikaylagawarecki/374/base 2025-12-04T10:32:19.4021674Z * [new branch] gh/mikaylagawarecki/374/head -> origin/gh/mikaylagawarecki/374/head 2025-12-04T10:32:19.4021765Z * [new branch] gh/mikaylagawarecki/374/orig -> origin/gh/mikaylagawarecki/374/orig 2025-12-04T10:32:19.4021857Z * [new branch] gh/mikaylagawarecki/375/base -> origin/gh/mikaylagawarecki/375/base 2025-12-04T10:32:19.4021948Z * [new branch] gh/mikaylagawarecki/375/head -> origin/gh/mikaylagawarecki/375/head 2025-12-04T10:32:19.4022042Z * [new branch] gh/mikaylagawarecki/375/orig -> origin/gh/mikaylagawarecki/375/orig 2025-12-04T10:32:19.4022132Z * [new branch] gh/mikaylagawarecki/376/base -> origin/gh/mikaylagawarecki/376/base 2025-12-04T10:32:19.4022223Z * [new branch] gh/mikaylagawarecki/376/head -> origin/gh/mikaylagawarecki/376/head 2025-12-04T10:32:19.4022316Z * [new branch] gh/mikaylagawarecki/376/orig -> origin/gh/mikaylagawarecki/376/orig 2025-12-04T10:32:19.4022407Z * [new branch] gh/mikaylagawarecki/377/base -> origin/gh/mikaylagawarecki/377/base 2025-12-04T10:32:19.4022496Z * [new branch] gh/mikaylagawarecki/377/head -> origin/gh/mikaylagawarecki/377/head 2025-12-04T10:32:19.4022636Z * [new branch] gh/mikaylagawarecki/377/orig -> origin/gh/mikaylagawarecki/377/orig 2025-12-04T10:32:19.4022727Z * [new branch] gh/mikaylagawarecki/378/base -> origin/gh/mikaylagawarecki/378/base 2025-12-04T10:32:19.4022822Z * [new branch] gh/mikaylagawarecki/378/head -> origin/gh/mikaylagawarecki/378/head 2025-12-04T10:32:19.4022915Z * [new branch] gh/mikaylagawarecki/378/orig -> origin/gh/mikaylagawarecki/378/orig 2025-12-04T10:32:19.4023007Z * [new branch] gh/mikaylagawarecki/379/base -> origin/gh/mikaylagawarecki/379/base 2025-12-04T10:32:19.4023100Z * [new branch] gh/mikaylagawarecki/379/head -> origin/gh/mikaylagawarecki/379/head 2025-12-04T10:32:19.4023191Z * [new branch] gh/mikaylagawarecki/379/orig -> origin/gh/mikaylagawarecki/379/orig 2025-12-04T10:32:19.4023282Z * [new branch] gh/mikaylagawarecki/380/base -> origin/gh/mikaylagawarecki/380/base 2025-12-04T10:32:19.4023377Z * [new branch] gh/mikaylagawarecki/380/head -> origin/gh/mikaylagawarecki/380/head 2025-12-04T10:32:19.4023468Z * [new branch] gh/mikaylagawarecki/380/orig -> origin/gh/mikaylagawarecki/380/orig 2025-12-04T10:32:19.4023558Z * [new branch] gh/mikaylagawarecki/381/base -> origin/gh/mikaylagawarecki/381/base 2025-12-04T10:32:19.4023684Z * [new branch] gh/mikaylagawarecki/381/head -> origin/gh/mikaylagawarecki/381/head 2025-12-04T10:32:19.4023777Z * [new branch] gh/mikaylagawarecki/381/orig -> origin/gh/mikaylagawarecki/381/orig 2025-12-04T10:32:19.4023868Z * [new branch] gh/mikaylagawarecki/382/base -> origin/gh/mikaylagawarecki/382/base 2025-12-04T10:32:19.4023962Z * [new branch] gh/mikaylagawarecki/382/head -> origin/gh/mikaylagawarecki/382/head 2025-12-04T10:32:19.4024054Z * [new branch] gh/mikaylagawarecki/382/orig -> origin/gh/mikaylagawarecki/382/orig 2025-12-04T10:32:19.4024148Z * [new branch] gh/mikaylagawarecki/383/base -> origin/gh/mikaylagawarecki/383/base 2025-12-04T10:32:19.4024239Z * [new branch] 
gh/mikaylagawarecki/383/head -> origin/gh/mikaylagawarecki/383/head 2025-12-04T10:32:19.4024330Z * [new branch] gh/mikaylagawarecki/383/orig -> origin/gh/mikaylagawarecki/383/orig 2025-12-04T10:32:19.4024424Z * [new branch] gh/mikaylagawarecki/384/base -> origin/gh/mikaylagawarecki/384/base 2025-12-04T10:32:19.4024516Z * [new branch] gh/mikaylagawarecki/384/head -> origin/gh/mikaylagawarecki/384/head 2025-12-04T10:32:19.4024608Z * [new branch] gh/mikaylagawarecki/384/orig -> origin/gh/mikaylagawarecki/384/orig 2025-12-04T10:32:19.4024704Z * [new branch] gh/mikaylagawarecki/385/base -> origin/gh/mikaylagawarecki/385/base 2025-12-04T10:32:19.4024795Z * [new branch] gh/mikaylagawarecki/385/head -> origin/gh/mikaylagawarecki/385/head 2025-12-04T10:32:19.4024886Z * [new branch] gh/mikaylagawarecki/385/orig -> origin/gh/mikaylagawarecki/385/orig 2025-12-04T10:32:19.4024979Z * [new branch] gh/mikaylagawarecki/386/base -> origin/gh/mikaylagawarecki/386/base 2025-12-04T10:32:19.4025070Z * [new branch] gh/mikaylagawarecki/386/head -> origin/gh/mikaylagawarecki/386/head 2025-12-04T10:32:19.4025162Z * [new branch] gh/mikaylagawarecki/386/orig -> origin/gh/mikaylagawarecki/386/orig 2025-12-04T10:32:19.4025255Z * [new branch] gh/mikaylagawarecki/387/base -> origin/gh/mikaylagawarecki/387/base 2025-12-04T10:32:19.4025351Z * [new branch] gh/mikaylagawarecki/387/head -> origin/gh/mikaylagawarecki/387/head 2025-12-04T10:32:19.4025442Z * [new branch] gh/mikaylagawarecki/387/orig -> origin/gh/mikaylagawarecki/387/orig 2025-12-04T10:32:19.4025537Z * [new branch] gh/mikaylagawarecki/388/base -> origin/gh/mikaylagawarecki/388/base 2025-12-04T10:32:19.4025661Z * [new branch] gh/mikaylagawarecki/388/head -> origin/gh/mikaylagawarecki/388/head 2025-12-04T10:32:19.4025753Z * [new branch] gh/mikaylagawarecki/388/orig -> origin/gh/mikaylagawarecki/388/orig 2025-12-04T10:32:19.4025844Z * [new branch] gh/mikaylagawarecki/389/base -> origin/gh/mikaylagawarecki/389/base 2025-12-04T10:32:19.4025937Z * [new branch] gh/mikaylagawarecki/389/head -> origin/gh/mikaylagawarecki/389/head 2025-12-04T10:32:19.4026028Z * [new branch] gh/mikaylagawarecki/389/orig -> origin/gh/mikaylagawarecki/389/orig 2025-12-04T10:32:19.4026119Z * [new branch] gh/mikaylagawarecki/390/base -> origin/gh/mikaylagawarecki/390/base 2025-12-04T10:32:19.4026209Z * [new branch] gh/mikaylagawarecki/390/head -> origin/gh/mikaylagawarecki/390/head 2025-12-04T10:32:19.4026303Z * [new branch] gh/mikaylagawarecki/390/orig -> origin/gh/mikaylagawarecki/390/orig 2025-12-04T10:32:19.4026397Z * [new branch] gh/mikaylagawarecki/391/base -> origin/gh/mikaylagawarecki/391/base 2025-12-04T10:32:19.4026488Z * [new branch] gh/mikaylagawarecki/391/head -> origin/gh/mikaylagawarecki/391/head 2025-12-04T10:32:19.4026582Z * [new branch] gh/mikaylagawarecki/391/orig -> origin/gh/mikaylagawarecki/391/orig 2025-12-04T10:32:19.4026718Z * [new branch] gh/mikaylagawarecki/392/base -> origin/gh/mikaylagawarecki/392/base 2025-12-04T10:32:19.4026810Z * [new branch] gh/mikaylagawarecki/392/head -> origin/gh/mikaylagawarecki/392/head 2025-12-04T10:32:19.4026904Z * [new branch] gh/mikaylagawarecki/392/orig -> origin/gh/mikaylagawarecki/392/orig 2025-12-04T10:32:19.4026972Z * [new branch] gh/mlazos/41/base -> origin/gh/mlazos/41/base 2025-12-04T10:32:19.4027041Z * [new branch] gh/mlazos/41/head -> origin/gh/mlazos/41/head 2025-12-04T10:32:19.4027107Z * [new branch] gh/mlazos/41/orig -> origin/gh/mlazos/41/orig 2025-12-04T10:32:19.4027176Z * [new branch] gh/mlazos/42/base -> 
origin/gh/mlazos/42/base 2025-12-04T10:32:19.4027244Z * [new branch] gh/mlazos/42/head -> origin/gh/mlazos/42/head 2025-12-04T10:32:19.4027311Z * [new branch] gh/mlazos/42/orig -> origin/gh/mlazos/42/orig 2025-12-04T10:32:19.4027378Z * [new branch] gh/mlazos/43/base -> origin/gh/mlazos/43/base 2025-12-04T10:32:19.4027446Z * [new branch] gh/mlazos/43/head -> origin/gh/mlazos/43/head 2025-12-04T10:32:19.4027512Z * [new branch] gh/mlazos/43/orig -> origin/gh/mlazos/43/orig 2025-12-04T10:32:19.4027578Z * [new branch] gh/mlazos/44/base -> origin/gh/mlazos/44/base 2025-12-04T10:32:19.4027646Z * [new branch] gh/mlazos/44/head -> origin/gh/mlazos/44/head 2025-12-04T10:32:19.4027712Z * [new branch] gh/mlazos/44/orig -> origin/gh/mlazos/44/orig 2025-12-04T10:32:19.4027780Z * [new branch] gh/mlazos/47/base -> origin/gh/mlazos/47/base 2025-12-04T10:32:19.4027849Z * [new branch] gh/mlazos/47/head -> origin/gh/mlazos/47/head 2025-12-04T10:32:19.4027913Z * [new branch] gh/mlazos/47/orig -> origin/gh/mlazos/47/orig 2025-12-04T10:32:19.4027981Z * [new branch] gh/mlazos/48/base -> origin/gh/mlazos/48/base 2025-12-04T10:32:19.4028048Z * [new branch] gh/mlazos/48/head -> origin/gh/mlazos/48/head 2025-12-04T10:32:19.4028114Z * [new branch] gh/mlazos/48/orig -> origin/gh/mlazos/48/orig 2025-12-04T10:32:19.4028180Z * [new branch] gh/mlazos/49/base -> origin/gh/mlazos/49/base 2025-12-04T10:32:19.4028247Z * [new branch] gh/mlazos/49/head -> origin/gh/mlazos/49/head 2025-12-04T10:32:19.4028311Z * [new branch] gh/mlazos/49/orig -> origin/gh/mlazos/49/orig 2025-12-04T10:32:19.4028404Z * [new branch] gh/mlazos/50/base -> origin/gh/mlazos/50/base 2025-12-04T10:32:19.4028472Z * [new branch] gh/mlazos/50/head -> origin/gh/mlazos/50/head 2025-12-04T10:32:19.4028538Z * [new branch] gh/mlazos/50/orig -> origin/gh/mlazos/50/orig 2025-12-04T10:32:19.4028605Z * [new branch] gh/mlazos/51/base -> origin/gh/mlazos/51/base 2025-12-04T10:32:19.4028673Z * [new branch] gh/mlazos/51/head -> origin/gh/mlazos/51/head 2025-12-04T10:32:19.4028739Z * [new branch] gh/mlazos/51/orig -> origin/gh/mlazos/51/orig 2025-12-04T10:32:19.4028806Z * [new branch] gh/mlazos/52/base -> origin/gh/mlazos/52/base 2025-12-04T10:32:19.4028872Z * [new branch] gh/mlazos/52/head -> origin/gh/mlazos/52/head 2025-12-04T10:32:19.4028938Z * [new branch] gh/mlazos/52/orig -> origin/gh/mlazos/52/orig 2025-12-04T10:32:19.4029009Z * [new branch] gh/mlazos/53/base -> origin/gh/mlazos/53/base 2025-12-04T10:32:19.4029074Z * [new branch] gh/mlazos/53/head -> origin/gh/mlazos/53/head 2025-12-04T10:32:19.4029140Z * [new branch] gh/mlazos/53/orig -> origin/gh/mlazos/53/orig 2025-12-04T10:32:19.4029209Z * [new branch] gh/mlazos/54/base -> origin/gh/mlazos/54/base 2025-12-04T10:32:19.4029303Z * [new branch] gh/mlazos/54/head -> origin/gh/mlazos/54/head 2025-12-04T10:32:19.4029367Z * [new branch] gh/mlazos/54/orig -> origin/gh/mlazos/54/orig 2025-12-04T10:32:19.4029435Z * [new branch] gh/mlazos/55/base -> origin/gh/mlazos/55/base 2025-12-04T10:32:19.4029521Z * [new branch] gh/mlazos/55/head -> origin/gh/mlazos/55/head 2025-12-04T10:32:19.4029619Z * [new branch] gh/mlazos/55/orig -> origin/gh/mlazos/55/orig 2025-12-04T10:32:19.4029691Z * [new branch] gh/mlazos/56/base -> origin/gh/mlazos/56/base 2025-12-04T10:32:19.4029757Z * [new branch] gh/mlazos/56/head -> origin/gh/mlazos/56/head 2025-12-04T10:32:19.4029821Z * [new branch] gh/mlazos/56/orig -> origin/gh/mlazos/56/orig 2025-12-04T10:32:19.4029889Z * [new branch] gh/mlazos/57/base -> origin/gh/mlazos/57/base 
2025-12-04T10:32:19.4029956Z * [new branch] gh/mlazos/57/head -> origin/gh/mlazos/57/head 2025-12-04T10:32:19.4030022Z * [new branch] gh/mlazos/57/orig -> origin/gh/mlazos/57/orig 2025-12-04T10:32:19.4030092Z * [new branch] gh/mlazos/58/base -> origin/gh/mlazos/58/base 2025-12-04T10:32:19.4030157Z * [new branch] gh/mlazos/58/head -> origin/gh/mlazos/58/head 2025-12-04T10:32:19.4030223Z * [new branch] gh/mlazos/58/orig -> origin/gh/mlazos/58/orig 2025-12-04T10:32:19.4030292Z * [new branch] gh/mlazos/59/base -> origin/gh/mlazos/59/base 2025-12-04T10:32:19.4030358Z * [new branch] gh/mlazos/59/head -> origin/gh/mlazos/59/head 2025-12-04T10:32:19.4030426Z * [new branch] gh/mlazos/59/orig -> origin/gh/mlazos/59/orig 2025-12-04T10:32:19.4030492Z * [new branch] gh/mlazos/60/base -> origin/gh/mlazos/60/base 2025-12-04T10:32:19.4030559Z * [new branch] gh/mlazos/60/head -> origin/gh/mlazos/60/head 2025-12-04T10:32:19.4030625Z * [new branch] gh/mlazos/60/orig -> origin/gh/mlazos/60/orig 2025-12-04T10:32:19.4030691Z * [new branch] gh/mlazos/61/base -> origin/gh/mlazos/61/base 2025-12-04T10:32:19.4030756Z * [new branch] gh/mlazos/61/head -> origin/gh/mlazos/61/head 2025-12-04T10:32:19.4030824Z * [new branch] gh/mlazos/61/orig -> origin/gh/mlazos/61/orig 2025-12-04T10:32:19.4030936Z * [new branch] gh/mlazos/62/base -> origin/gh/mlazos/62/base 2025-12-04T10:32:19.4031003Z * [new branch] gh/mlazos/62/head -> origin/gh/mlazos/62/head 2025-12-04T10:32:19.4031071Z * [new branch] gh/mlazos/62/orig -> origin/gh/mlazos/62/orig 2025-12-04T10:32:19.4031136Z * [new branch] gh/mlazos/63/base -> origin/gh/mlazos/63/base 2025-12-04T10:32:19.4031204Z * [new branch] gh/mlazos/63/head -> origin/gh/mlazos/63/head 2025-12-04T10:32:19.4031272Z * [new branch] gh/mlazos/63/orig -> origin/gh/mlazos/63/orig 2025-12-04T10:32:19.4031338Z * [new branch] gh/mlazos/64/base -> origin/gh/mlazos/64/base 2025-12-04T10:32:19.4031404Z * [new branch] gh/mlazos/64/head -> origin/gh/mlazos/64/head 2025-12-04T10:32:19.4031473Z * [new branch] gh/mlazos/64/orig -> origin/gh/mlazos/64/orig 2025-12-04T10:32:19.4031538Z * [new branch] gh/mlazos/65/base -> origin/gh/mlazos/65/base 2025-12-04T10:32:19.4031606Z * [new branch] gh/mlazos/65/head -> origin/gh/mlazos/65/head 2025-12-04T10:32:19.4031673Z * [new branch] gh/mlazos/65/orig -> origin/gh/mlazos/65/orig 2025-12-04T10:32:19.4031739Z * [new branch] gh/mlazos/66/base -> origin/gh/mlazos/66/base 2025-12-04T10:32:19.4031852Z * [new branch] gh/mlazos/66/head -> origin/gh/mlazos/66/head 2025-12-04T10:32:19.4031921Z * [new branch] gh/mlazos/66/orig -> origin/gh/mlazos/66/orig 2025-12-04T10:32:19.4031987Z * [new branch] gh/mlazos/67/base -> origin/gh/mlazos/67/base 2025-12-04T10:32:19.4032055Z * [new branch] gh/mlazos/67/head -> origin/gh/mlazos/67/head 2025-12-04T10:32:19.4032120Z * [new branch] gh/mlazos/67/orig -> origin/gh/mlazos/67/orig 2025-12-04T10:32:19.4032186Z * [new branch] gh/mlazos/68/base -> origin/gh/mlazos/68/base 2025-12-04T10:32:19.4032254Z * [new branch] gh/mlazos/68/head -> origin/gh/mlazos/68/head 2025-12-04T10:32:19.4032321Z * [new branch] gh/mlazos/68/orig -> origin/gh/mlazos/68/orig 2025-12-04T10:32:19.4032388Z * [new branch] gh/mlazos/69/base -> origin/gh/mlazos/69/base 2025-12-04T10:32:19.4032457Z * [new branch] gh/mlazos/69/head -> origin/gh/mlazos/69/head 2025-12-04T10:32:19.4032522Z * [new branch] gh/mlazos/69/orig -> origin/gh/mlazos/69/orig 2025-12-04T10:32:19.4032588Z * [new branch] gh/mlazos/70/base -> origin/gh/mlazos/70/base 2025-12-04T10:32:19.4032655Z * [new branch] 
gh/mlazos/70/head -> origin/gh/mlazos/70/head 2025-12-04T10:32:19.4032723Z * [new branch] gh/mlazos/70/orig -> origin/gh/mlazos/70/orig 2025-12-04T10:32:19.4032789Z * [new branch] gh/mlazos/71/base -> origin/gh/mlazos/71/base 2025-12-04T10:32:19.4032856Z * [new branch] gh/mlazos/71/head -> origin/gh/mlazos/71/head 2025-12-04T10:32:19.4032922Z * [new branch] gh/mlazos/71/orig -> origin/gh/mlazos/71/orig 2025-12-04T10:32:19.4032988Z * [new branch] gh/mlazos/72/base -> origin/gh/mlazos/72/base 2025-12-04T10:32:19.4033058Z * [new branch] gh/mlazos/72/head -> origin/gh/mlazos/72/head 2025-12-04T10:32:19.4033124Z * [new branch] gh/mlazos/72/orig -> origin/gh/mlazos/72/orig 2025-12-04T10:32:19.4033190Z * [new branch] gh/mlazos/73/base -> origin/gh/mlazos/73/base 2025-12-04T10:32:19.4033258Z * [new branch] gh/mlazos/73/head -> origin/gh/mlazos/73/head 2025-12-04T10:32:19.4033324Z * [new branch] gh/mlazos/73/orig -> origin/gh/mlazos/73/orig 2025-12-04T10:32:19.4033390Z * [new branch] gh/mrmiywj/1/base -> origin/gh/mrmiywj/1/base 2025-12-04T10:32:19.4033494Z * [new branch] gh/mrmiywj/1/head -> origin/gh/mrmiywj/1/head 2025-12-04T10:32:19.4033568Z * [new branch] gh/muchulee8/73/base -> origin/gh/muchulee8/73/base 2025-12-04T10:32:19.4033644Z * [new branch] gh/muchulee8/73/head -> origin/gh/muchulee8/73/head 2025-12-04T10:32:19.4033717Z * [new branch] gh/muchulee8/73/orig -> origin/gh/muchulee8/73/orig 2025-12-04T10:32:19.4033802Z * [new branch] gh/naveenthangudu/1/base -> origin/gh/naveenthangudu/1/base 2025-12-04T10:32:19.4033886Z * [new branch] gh/naveenthangudu/1/head -> origin/gh/naveenthangudu/1/head 2025-12-04T10:32:19.4033967Z * [new branch] gh/naveenthangudu/1/orig -> origin/gh/naveenthangudu/1/orig 2025-12-04T10:32:19.4034048Z * [new branch] gh/naveenthangudu/2/base -> origin/gh/naveenthangudu/2/base 2025-12-04T10:32:19.4034130Z * [new branch] gh/naveenthangudu/2/head -> origin/gh/naveenthangudu/2/head 2025-12-04T10:32:19.4034210Z * [new branch] gh/naveenthangudu/2/orig -> origin/gh/naveenthangudu/2/orig 2025-12-04T10:32:19.4034290Z * [new branch] gh/naveenthangudu/3/base -> origin/gh/naveenthangudu/3/base 2025-12-04T10:32:19.4034372Z * [new branch] gh/naveenthangudu/3/head -> origin/gh/naveenthangudu/3/head 2025-12-04T10:32:19.4034481Z * [new branch] gh/naveenthangudu/3/orig -> origin/gh/naveenthangudu/3/orig 2025-12-04T10:32:19.4034560Z * [new branch] gh/naveenthangudu/4/base -> origin/gh/naveenthangudu/4/base 2025-12-04T10:32:19.4034640Z * [new branch] gh/naveenthangudu/4/head -> origin/gh/naveenthangudu/4/head 2025-12-04T10:32:19.4034720Z * [new branch] gh/naveenthangudu/4/orig -> origin/gh/naveenthangudu/4/orig 2025-12-04T10:32:19.4034799Z * [new branch] gh/naveenthangudu/5/base -> origin/gh/naveenthangudu/5/base 2025-12-04T10:32:19.4034885Z * [new branch] gh/naveenthangudu/5/head -> origin/gh/naveenthangudu/5/head 2025-12-04T10:32:19.4034963Z * [new branch] gh/naveenthangudu/5/orig -> origin/gh/naveenthangudu/5/orig 2025-12-04T10:32:19.4035043Z * [new branch] gh/naveenthangudu/6/base -> origin/gh/naveenthangudu/6/base 2025-12-04T10:32:19.4035123Z * [new branch] gh/naveenthangudu/6/head -> origin/gh/naveenthangudu/6/head 2025-12-04T10:32:19.4035202Z * [new branch] gh/naveenthangudu/6/orig -> origin/gh/naveenthangudu/6/orig 2025-12-04T10:32:19.4035282Z * [new branch] gh/naveenthangudu/7/base -> origin/gh/naveenthangudu/7/base 2025-12-04T10:32:19.4035361Z * [new branch] gh/naveenthangudu/7/head -> origin/gh/naveenthangudu/7/head 2025-12-04T10:32:19.4035440Z * [new branch] 
gh/naveenthangudu/7/orig -> origin/gh/naveenthangudu/7/orig 2025-12-04T10:32:19.4035520Z * [new branch] gh/naveenthangudu/8/base -> origin/gh/naveenthangudu/8/base 2025-12-04T10:32:19.4035600Z * [new branch] gh/naveenthangudu/8/head -> origin/gh/naveenthangudu/8/head 2025-12-04T10:32:19.4035679Z * [new branch] gh/naveenthangudu/8/orig -> origin/gh/naveenthangudu/8/orig 2025-12-04T10:32:19.4035760Z * [new branch] gh/naveenthangudu/9/base -> origin/gh/naveenthangudu/9/base 2025-12-04T10:32:19.4035841Z * [new branch] gh/naveenthangudu/9/head -> origin/gh/naveenthangudu/9/head 2025-12-04T10:32:19.4035921Z * [new branch] gh/naveenthangudu/9/orig -> origin/gh/naveenthangudu/9/orig 2025-12-04T10:32:19.4035996Z * [new branch] gh/nikitaved/1/base -> origin/gh/nikitaved/1/base 2025-12-04T10:32:19.4036069Z * [new branch] gh/nikitaved/1/head -> origin/gh/nikitaved/1/head 2025-12-04T10:32:19.4036141Z * [new branch] gh/nikitaved/1/orig -> origin/gh/nikitaved/1/orig 2025-12-04T10:32:19.4036215Z * [new branch] gh/nikitaved/10/base -> origin/gh/nikitaved/10/base 2025-12-04T10:32:19.4036318Z * [new branch] gh/nikitaved/10/head -> origin/gh/nikitaved/10/head 2025-12-04T10:32:19.4036390Z * [new branch] gh/nikitaved/10/orig -> origin/gh/nikitaved/10/orig 2025-12-04T10:32:19.4036462Z * [new branch] gh/nikitaved/11/base -> origin/gh/nikitaved/11/base 2025-12-04T10:32:19.4036534Z * [new branch] gh/nikitaved/11/head -> origin/gh/nikitaved/11/head 2025-12-04T10:32:19.4036606Z * [new branch] gh/nikitaved/11/orig -> origin/gh/nikitaved/11/orig 2025-12-04T10:32:19.4036680Z * [new branch] gh/nikitaved/12/base -> origin/gh/nikitaved/12/base 2025-12-04T10:32:19.4036751Z * [new branch] gh/nikitaved/12/head -> origin/gh/nikitaved/12/head 2025-12-04T10:32:19.4036823Z * [new branch] gh/nikitaved/12/orig -> origin/gh/nikitaved/12/orig 2025-12-04T10:32:19.4036896Z * [new branch] gh/nikitaved/13/base -> origin/gh/nikitaved/13/base 2025-12-04T10:32:19.4036966Z * [new branch] gh/nikitaved/13/head -> origin/gh/nikitaved/13/head 2025-12-04T10:32:19.4037038Z * [new branch] gh/nikitaved/13/orig -> origin/gh/nikitaved/13/orig 2025-12-04T10:32:19.4037109Z * [new branch] gh/nikitaved/14/base -> origin/gh/nikitaved/14/base 2025-12-04T10:32:19.4037222Z * [new branch] gh/nikitaved/14/head -> origin/gh/nikitaved/14/head 2025-12-04T10:32:19.4037295Z * [new branch] gh/nikitaved/14/orig -> origin/gh/nikitaved/14/orig 2025-12-04T10:32:19.4037366Z * [new branch] gh/nikitaved/15/base -> origin/gh/nikitaved/15/base 2025-12-04T10:32:19.4037437Z * [new branch] gh/nikitaved/15/head -> origin/gh/nikitaved/15/head 2025-12-04T10:32:19.4037509Z * [new branch] gh/nikitaved/15/orig -> origin/gh/nikitaved/15/orig 2025-12-04T10:32:19.4037581Z * [new branch] gh/nikitaved/16/base -> origin/gh/nikitaved/16/base 2025-12-04T10:32:19.4037653Z * [new branch] gh/nikitaved/16/head -> origin/gh/nikitaved/16/head 2025-12-04T10:32:19.4037726Z * [new branch] gh/nikitaved/16/orig -> origin/gh/nikitaved/16/orig 2025-12-04T10:32:19.4037797Z * [new branch] gh/nikitaved/2/base -> origin/gh/nikitaved/2/base 2025-12-04T10:32:19.4037870Z * [new branch] gh/nikitaved/2/head -> origin/gh/nikitaved/2/head 2025-12-04T10:32:19.4037943Z * [new branch] gh/nikitaved/2/orig -> origin/gh/nikitaved/2/orig 2025-12-04T10:32:19.4038013Z * [new branch] gh/nikitaved/4/base -> origin/gh/nikitaved/4/base 2025-12-04T10:32:19.4038083Z * [new branch] gh/nikitaved/4/head -> origin/gh/nikitaved/4/head 2025-12-04T10:32:19.4038155Z * [new branch] gh/nikitaved/4/orig -> origin/gh/nikitaved/4/orig 
2025-12-04T10:32:19.4038226Z * [new branch] gh/nikitaved/5/base -> origin/gh/nikitaved/5/base 2025-12-04T10:32:19.4038296Z * [new branch] gh/nikitaved/5/head -> origin/gh/nikitaved/5/head 2025-12-04T10:32:19.4038369Z * [new branch] gh/nikitaved/5/orig -> origin/gh/nikitaved/5/orig 2025-12-04T10:32:19.4038439Z * [new branch] gh/nikitaved/6/base -> origin/gh/nikitaved/6/base 2025-12-04T10:32:19.4038513Z * [new branch] gh/nikitaved/6/head -> origin/gh/nikitaved/6/head 2025-12-04T10:32:19.4038583Z * [new branch] gh/nikitaved/6/orig -> origin/gh/nikitaved/6/orig 2025-12-04T10:32:19.4038653Z * [new branch] gh/nikitaved/8/base -> origin/gh/nikitaved/8/base 2025-12-04T10:32:19.4038725Z * [new branch] gh/nikitaved/8/head -> origin/gh/nikitaved/8/head 2025-12-04T10:32:19.4038795Z * [new branch] gh/nikitaved/8/orig -> origin/gh/nikitaved/8/orig 2025-12-04T10:32:19.4038896Z * [new branch] gh/nikitaved/9/base -> origin/gh/nikitaved/9/base 2025-12-04T10:32:19.4038968Z * [new branch] gh/nikitaved/9/head -> origin/gh/nikitaved/9/head 2025-12-04T10:32:19.4039039Z * [new branch] gh/nikitaved/9/orig -> origin/gh/nikitaved/9/orig 2025-12-04T10:32:19.4039105Z * [new branch] gh/oulgen/10/base -> origin/gh/oulgen/10/base 2025-12-04T10:32:19.4039175Z * [new branch] gh/oulgen/10/head -> origin/gh/oulgen/10/head 2025-12-04T10:32:19.4039242Z * [new branch] gh/oulgen/10/orig -> origin/gh/oulgen/10/orig 2025-12-04T10:32:19.4039307Z * [new branch] gh/oulgen/11/base -> origin/gh/oulgen/11/base 2025-12-04T10:32:19.4039377Z * [new branch] gh/oulgen/11/head -> origin/gh/oulgen/11/head 2025-12-04T10:32:19.4039443Z * [new branch] gh/oulgen/11/orig -> origin/gh/oulgen/11/orig 2025-12-04T10:32:19.4039508Z * [new branch] gh/oulgen/12/base -> origin/gh/oulgen/12/base 2025-12-04T10:32:19.4039611Z * [new branch] gh/oulgen/12/head -> origin/gh/oulgen/12/head 2025-12-04T10:32:19.4039678Z * [new branch] gh/oulgen/12/orig -> origin/gh/oulgen/12/orig 2025-12-04T10:32:19.4039744Z * [new branch] gh/oulgen/13/base -> origin/gh/oulgen/13/base 2025-12-04T10:32:19.4039859Z * [new branch] gh/oulgen/13/head -> origin/gh/oulgen/13/head 2025-12-04T10:32:19.4039926Z * [new branch] gh/oulgen/13/orig -> origin/gh/oulgen/13/orig 2025-12-04T10:32:19.4039993Z * [new branch] gh/oulgen/14/base -> origin/gh/oulgen/14/base 2025-12-04T10:32:19.4040061Z * [new branch] gh/oulgen/14/head -> origin/gh/oulgen/14/head 2025-12-04T10:32:19.4040127Z * [new branch] gh/oulgen/14/orig -> origin/gh/oulgen/14/orig 2025-12-04T10:32:19.4040193Z * [new branch] gh/oulgen/15/base -> origin/gh/oulgen/15/base 2025-12-04T10:32:19.4040259Z * [new branch] gh/oulgen/15/head -> origin/gh/oulgen/15/head 2025-12-04T10:32:19.4040324Z * [new branch] gh/oulgen/15/orig -> origin/gh/oulgen/15/orig 2025-12-04T10:32:19.4040392Z * [new branch] gh/oulgen/16/base -> origin/gh/oulgen/16/base 2025-12-04T10:32:19.4040458Z * [new branch] gh/oulgen/16/head -> origin/gh/oulgen/16/head 2025-12-04T10:32:19.4040522Z * [new branch] gh/oulgen/16/orig -> origin/gh/oulgen/16/orig 2025-12-04T10:32:19.4040589Z * [new branch] gh/oulgen/17/base -> origin/gh/oulgen/17/base 2025-12-04T10:32:19.4040654Z * [new branch] gh/oulgen/17/head -> origin/gh/oulgen/17/head 2025-12-04T10:32:19.4040720Z * [new branch] gh/oulgen/17/orig -> origin/gh/oulgen/17/orig 2025-12-04T10:32:19.4040787Z * [new branch] gh/oulgen/18/base -> origin/gh/oulgen/18/base 2025-12-04T10:32:19.4040854Z * [new branch] gh/oulgen/18/head -> origin/gh/oulgen/18/head 2025-12-04T10:32:19.4040918Z * [new branch] gh/oulgen/18/orig -> 
origin/gh/oulgen/18/orig 2025-12-04T10:32:19.4040986Z * [new branch] gh/oulgen/19/base -> origin/gh/oulgen/19/base 2025-12-04T10:32:19.4041053Z * [new branch] gh/oulgen/19/head -> origin/gh/oulgen/19/head 2025-12-04T10:32:19.4041119Z * [new branch] gh/oulgen/19/orig -> origin/gh/oulgen/19/orig 2025-12-04T10:32:19.4041185Z * [new branch] gh/oulgen/20/base -> origin/gh/oulgen/20/base 2025-12-04T10:32:19.4041249Z * [new branch] gh/oulgen/20/head -> origin/gh/oulgen/20/head 2025-12-04T10:32:19.4041313Z * [new branch] gh/oulgen/20/orig -> origin/gh/oulgen/20/orig 2025-12-04T10:32:19.4041382Z * [new branch] gh/oulgen/21/base -> origin/gh/oulgen/21/base 2025-12-04T10:32:19.4041491Z * [new branch] gh/oulgen/21/head -> origin/gh/oulgen/21/head 2025-12-04T10:32:19.4041557Z * [new branch] gh/oulgen/21/orig -> origin/gh/oulgen/21/orig 2025-12-04T10:32:19.4041625Z * [new branch] gh/oulgen/22/base -> origin/gh/oulgen/22/base 2025-12-04T10:32:19.4041693Z * [new branch] gh/oulgen/22/head -> origin/gh/oulgen/22/head 2025-12-04T10:32:19.4041761Z * [new branch] gh/oulgen/22/orig -> origin/gh/oulgen/22/orig 2025-12-04T10:32:19.4041826Z * [new branch] gh/oulgen/23/base -> origin/gh/oulgen/23/base 2025-12-04T10:32:19.4041892Z * [new branch] gh/oulgen/23/head -> origin/gh/oulgen/23/head 2025-12-04T10:32:19.4041960Z * [new branch] gh/oulgen/23/orig -> origin/gh/oulgen/23/orig 2025-12-04T10:32:19.4042025Z * [new branch] gh/oulgen/24/base -> origin/gh/oulgen/24/base 2025-12-04T10:32:19.4042094Z * [new branch] gh/oulgen/24/head -> origin/gh/oulgen/24/head 2025-12-04T10:32:19.4042162Z * [new branch] gh/oulgen/24/orig -> origin/gh/oulgen/24/orig 2025-12-04T10:32:19.4042228Z * [new branch] gh/oulgen/25/base -> origin/gh/oulgen/25/base 2025-12-04T10:32:19.4042323Z * [new branch] gh/oulgen/25/head -> origin/gh/oulgen/25/head 2025-12-04T10:32:19.4042390Z * [new branch] gh/oulgen/25/orig -> origin/gh/oulgen/25/orig 2025-12-04T10:32:19.4042455Z * [new branch] gh/oulgen/26/base -> origin/gh/oulgen/26/base 2025-12-04T10:32:19.4042520Z * [new branch] gh/oulgen/26/head -> origin/gh/oulgen/26/head 2025-12-04T10:32:19.4042587Z * [new branch] gh/oulgen/26/orig -> origin/gh/oulgen/26/orig 2025-12-04T10:32:19.4042654Z * [new branch] gh/oulgen/4/base -> origin/gh/oulgen/4/base 2025-12-04T10:32:19.4042722Z * [new branch] gh/oulgen/4/head -> origin/gh/oulgen/4/head 2025-12-04T10:32:19.4042789Z * [new branch] gh/oulgen/4/orig -> origin/gh/oulgen/4/orig 2025-12-04T10:32:19.4042856Z * [new branch] gh/oulgen/7/base -> origin/gh/oulgen/7/base 2025-12-04T10:32:19.4042921Z * [new branch] gh/oulgen/7/head -> origin/gh/oulgen/7/head 2025-12-04T10:32:19.4042990Z * [new branch] gh/oulgen/7/orig -> origin/gh/oulgen/7/orig 2025-12-04T10:32:19.4043054Z * [new branch] gh/oulgen/8/base -> origin/gh/oulgen/8/base 2025-12-04T10:32:19.4043119Z * [new branch] gh/oulgen/8/head -> origin/gh/oulgen/8/head 2025-12-04T10:32:19.4043187Z * [new branch] gh/oulgen/8/orig -> origin/gh/oulgen/8/orig 2025-12-04T10:32:19.4043252Z * [new branch] gh/oulgen/9/base -> origin/gh/oulgen/9/base 2025-12-04T10:32:19.4043318Z * [new branch] gh/oulgen/9/head -> origin/gh/oulgen/9/head 2025-12-04T10:32:19.4043385Z * [new branch] gh/oulgen/9/orig -> origin/gh/oulgen/9/orig 2025-12-04T10:32:19.4043489Z * [new branch] gh/patvig/mtia-serialization -> origin/gh/patvig/mtia-serialization 2025-12-04T10:32:19.4043558Z * [new branch] gh/pearu/108/base -> origin/gh/pearu/108/base 2025-12-04T10:32:19.4043626Z * [new branch] gh/pearu/108/head -> origin/gh/pearu/108/head 
2025-12-04T10:32:19.4043692Z * [new branch] gh/pearu/108/orig -> origin/gh/pearu/108/orig 2025-12-04T10:32:19.4043761Z * [new branch] gh/pearu/109/base -> origin/gh/pearu/109/base 2025-12-04T10:32:19.4043825Z * [new branch] gh/pearu/109/head -> origin/gh/pearu/109/head 2025-12-04T10:32:19.4043890Z * [new branch] gh/pearu/109/orig -> origin/gh/pearu/109/orig 2025-12-04T10:32:19.4043988Z * [new branch] gh/pearu/110/base -> origin/gh/pearu/110/base 2025-12-04T10:32:19.4044054Z * [new branch] gh/pearu/110/head -> origin/gh/pearu/110/head 2025-12-04T10:32:19.4044120Z * [new branch] gh/pearu/110/orig -> origin/gh/pearu/110/orig 2025-12-04T10:32:19.4044185Z * [new branch] gh/pearu/111/base -> origin/gh/pearu/111/base 2025-12-04T10:32:19.4044252Z * [new branch] gh/pearu/111/head -> origin/gh/pearu/111/head 2025-12-04T10:32:19.4044317Z * [new branch] gh/pearu/111/orig -> origin/gh/pearu/111/orig 2025-12-04T10:32:19.4044385Z * [new branch] gh/pearu/112/base -> origin/gh/pearu/112/base 2025-12-04T10:32:19.4044450Z * [new branch] gh/pearu/112/head -> origin/gh/pearu/112/head 2025-12-04T10:32:19.4044515Z * [new branch] gh/pearu/112/orig -> origin/gh/pearu/112/orig 2025-12-04T10:32:19.4044586Z * [new branch] gh/pearu/115/base -> origin/gh/pearu/115/base 2025-12-04T10:32:19.4044651Z * [new branch] gh/pearu/115/head -> origin/gh/pearu/115/head 2025-12-04T10:32:19.4044716Z * [new branch] gh/pearu/115/orig -> origin/gh/pearu/115/orig 2025-12-04T10:32:19.4044782Z * [new branch] gh/pearu/116/base -> origin/gh/pearu/116/base 2025-12-04T10:32:19.4044875Z * [new branch] gh/pearu/116/head -> origin/gh/pearu/116/head 2025-12-04T10:32:19.4044942Z * [new branch] gh/pearu/116/orig -> origin/gh/pearu/116/orig 2025-12-04T10:32:19.4045008Z * [new branch] gh/pearu/117/base -> origin/gh/pearu/117/base 2025-12-04T10:32:19.4045074Z * [new branch] gh/pearu/117/head -> origin/gh/pearu/117/head 2025-12-04T10:32:19.4045141Z * [new branch] gh/pearu/117/orig -> origin/gh/pearu/117/orig 2025-12-04T10:32:19.4045206Z * [new branch] gh/pearu/118/base -> origin/gh/pearu/118/base 2025-12-04T10:32:19.4045274Z * [new branch] gh/pearu/118/head -> origin/gh/pearu/118/head 2025-12-04T10:32:19.4045341Z * [new branch] gh/pearu/118/orig -> origin/gh/pearu/118/orig 2025-12-04T10:32:19.4045407Z * [new branch] gh/pearu/119/base -> origin/gh/pearu/119/base 2025-12-04T10:32:19.4045475Z * [new branch] gh/pearu/119/head -> origin/gh/pearu/119/head 2025-12-04T10:32:19.4045543Z * [new branch] gh/pearu/119/orig -> origin/gh/pearu/119/orig 2025-12-04T10:32:19.4045609Z * [new branch] gh/pearu/139/base -> origin/gh/pearu/139/base 2025-12-04T10:32:19.4045674Z * [new branch] gh/pearu/139/head -> origin/gh/pearu/139/head 2025-12-04T10:32:19.4045742Z * [new branch] gh/pearu/139/orig -> origin/gh/pearu/139/orig 2025-12-04T10:32:19.4045809Z * [new branch] gh/pearu/140/base -> origin/gh/pearu/140/base 2025-12-04T10:32:19.4045877Z * [new branch] gh/pearu/140/head -> origin/gh/pearu/140/head 2025-12-04T10:32:19.4045944Z * [new branch] gh/pearu/140/orig -> origin/gh/pearu/140/orig 2025-12-04T10:32:19.4046008Z * [new branch] gh/pearu/142/base -> origin/gh/pearu/142/base 2025-12-04T10:32:19.4046075Z * [new branch] gh/pearu/142/head -> origin/gh/pearu/142/head 2025-12-04T10:32:19.4046142Z * [new branch] gh/pearu/142/orig -> origin/gh/pearu/142/orig 2025-12-04T10:32:19.4046207Z * [new branch] gh/pearu/143/base -> origin/gh/pearu/143/base 2025-12-04T10:32:19.4046275Z * [new branch] gh/pearu/143/head -> origin/gh/pearu/143/head 2025-12-04T10:32:19.4046342Z * [new branch] 
gh/pearu/143/orig -> origin/gh/pearu/143/orig 2025-12-04T10:32:19.4046408Z * [new branch] gh/pearu/147/base -> origin/gh/pearu/147/base 2025-12-04T10:32:19.4046510Z * [new branch] gh/pearu/147/head -> origin/gh/pearu/147/head 2025-12-04T10:32:19.4046577Z * [new branch] gh/pearu/147/orig -> origin/gh/pearu/147/orig 2025-12-04T10:32:19.4046642Z * [new branch] gh/pearu/149/base -> origin/gh/pearu/149/base 2025-12-04T10:32:19.4046710Z * [new branch] gh/pearu/149/head -> origin/gh/pearu/149/head 2025-12-04T10:32:19.4046775Z * [new branch] gh/pearu/149/orig -> origin/gh/pearu/149/orig 2025-12-04T10:32:19.4046842Z * [new branch] gh/pearu/150/base -> origin/gh/pearu/150/base 2025-12-04T10:32:19.4046909Z * [new branch] gh/pearu/150/head -> origin/gh/pearu/150/head 2025-12-04T10:32:19.4046974Z * [new branch] gh/pearu/150/orig -> origin/gh/pearu/150/orig 2025-12-04T10:32:19.4047039Z * [new branch] gh/pearu/151/base -> origin/gh/pearu/151/base 2025-12-04T10:32:19.4047109Z * [new branch] gh/pearu/151/head -> origin/gh/pearu/151/head 2025-12-04T10:32:19.4047174Z * [new branch] gh/pearu/151/orig -> origin/gh/pearu/151/orig 2025-12-04T10:32:19.4047240Z * [new branch] gh/pearu/152/base -> origin/gh/pearu/152/base 2025-12-04T10:32:19.4047308Z * [new branch] gh/pearu/152/head -> origin/gh/pearu/152/head 2025-12-04T10:32:19.4047405Z * [new branch] gh/pearu/152/orig -> origin/gh/pearu/152/orig 2025-12-04T10:32:19.4047471Z * [new branch] gh/pearu/153/base -> origin/gh/pearu/153/base 2025-12-04T10:32:19.4047539Z * [new branch] gh/pearu/153/head -> origin/gh/pearu/153/head 2025-12-04T10:32:19.4047604Z * [new branch] gh/pearu/153/orig -> origin/gh/pearu/153/orig 2025-12-04T10:32:19.4047670Z * [new branch] gh/pearu/154/base -> origin/gh/pearu/154/base 2025-12-04T10:32:19.4047739Z * [new branch] gh/pearu/154/head -> origin/gh/pearu/154/head 2025-12-04T10:32:19.4047804Z * [new branch] gh/pearu/154/orig -> origin/gh/pearu/154/orig 2025-12-04T10:32:19.4047870Z * [new branch] gh/pearu/155/base -> origin/gh/pearu/155/base 2025-12-04T10:32:19.4047938Z * [new branch] gh/pearu/155/head -> origin/gh/pearu/155/head 2025-12-04T10:32:19.4048006Z * [new branch] gh/pearu/155/orig -> origin/gh/pearu/155/orig 2025-12-04T10:32:19.4048072Z * [new branch] gh/pearu/156/base -> origin/gh/pearu/156/base 2025-12-04T10:32:19.4048141Z * [new branch] gh/pearu/156/head -> origin/gh/pearu/156/head 2025-12-04T10:32:19.4054013Z * [new branch] gh/pearu/156/orig -> origin/gh/pearu/156/orig 2025-12-04T10:32:19.4054096Z * [new branch] gh/pearu/56/base -> origin/gh/pearu/56/base 2025-12-04T10:32:19.4054170Z * [new branch] gh/pearu/56/head -> origin/gh/pearu/56/head 2025-12-04T10:32:19.4054236Z * [new branch] gh/pearu/56/orig -> origin/gh/pearu/56/orig 2025-12-04T10:32:19.4054301Z * [new branch] gh/pearu/97/base -> origin/gh/pearu/97/base 2025-12-04T10:32:19.4054367Z * [new branch] gh/pearu/97/head -> origin/gh/pearu/97/head 2025-12-04T10:32:19.4054437Z * [new branch] gh/pearu/97/orig -> origin/gh/pearu/97/orig 2025-12-04T10:32:19.4054510Z * [new branch] gh/pianpwk/21/base -> origin/gh/pianpwk/21/base 2025-12-04T10:32:19.4054583Z * [new branch] gh/pianpwk/21/head -> origin/gh/pianpwk/21/head 2025-12-04T10:32:19.4054656Z * [new branch] gh/pianpwk/28/base -> origin/gh/pianpwk/28/base 2025-12-04T10:32:19.4054724Z * [new branch] gh/pianpwk/28/head -> origin/gh/pianpwk/28/head 2025-12-04T10:32:19.4054854Z * [new branch] gh/pianpwk/28/orig -> origin/gh/pianpwk/28/orig 2025-12-04T10:32:19.4054923Z * [new branch] gh/pianpwk/29/base -> 
origin/gh/pianpwk/29/base 2025-12-04T10:32:19.4054991Z * [new branch] gh/pianpwk/29/head -> origin/gh/pianpwk/29/head 2025-12-04T10:32:19.4055060Z * [new branch] gh/pianpwk/29/orig -> origin/gh/pianpwk/29/orig 2025-12-04T10:32:19.4055130Z * [new branch] gh/pianpwk/30/base -> origin/gh/pianpwk/30/base 2025-12-04T10:32:19.4055198Z * [new branch] gh/pianpwk/30/head -> origin/gh/pianpwk/30/head 2025-12-04T10:32:19.4055268Z * [new branch] gh/pianpwk/30/orig -> origin/gh/pianpwk/30/orig 2025-12-04T10:32:19.4055337Z * [new branch] gh/pianpwk/31/base -> origin/gh/pianpwk/31/base 2025-12-04T10:32:19.4055406Z * [new branch] gh/pianpwk/31/head -> origin/gh/pianpwk/31/head 2025-12-04T10:32:19.4055475Z * [new branch] gh/pianpwk/31/orig -> origin/gh/pianpwk/31/orig 2025-12-04T10:32:19.4055546Z * [new branch] gh/pianpwk/32/base -> origin/gh/pianpwk/32/base 2025-12-04T10:32:19.4055615Z * [new branch] gh/pianpwk/32/head -> origin/gh/pianpwk/32/head 2025-12-04T10:32:19.4055684Z * [new branch] gh/pianpwk/32/orig -> origin/gh/pianpwk/32/orig 2025-12-04T10:32:19.4055806Z * [new branch] gh/pianpwk/33/base -> origin/gh/pianpwk/33/base 2025-12-04T10:32:19.4055876Z * [new branch] gh/pianpwk/33/head -> origin/gh/pianpwk/33/head 2025-12-04T10:32:19.4055945Z * [new branch] gh/pianpwk/33/orig -> origin/gh/pianpwk/33/orig 2025-12-04T10:32:19.4056015Z * [new branch] gh/pianpwk/34/base -> origin/gh/pianpwk/34/base 2025-12-04T10:32:19.4056085Z * [new branch] gh/pianpwk/34/head -> origin/gh/pianpwk/34/head 2025-12-04T10:32:19.4056153Z * [new branch] gh/pianpwk/34/orig -> origin/gh/pianpwk/34/orig 2025-12-04T10:32:19.4056227Z * [new branch] gh/pianpwk/35/base -> origin/gh/pianpwk/35/base 2025-12-04T10:32:19.4056297Z * [new branch] gh/pianpwk/35/head -> origin/gh/pianpwk/35/head 2025-12-04T10:32:19.4056366Z * [new branch] gh/pianpwk/35/orig -> origin/gh/pianpwk/35/orig 2025-12-04T10:32:19.4056435Z * [new branch] gh/rec/141/base -> origin/gh/rec/141/base 2025-12-04T10:32:19.4056503Z * [new branch] gh/rec/141/head -> origin/gh/rec/141/head 2025-12-04T10:32:19.4056570Z * [new branch] gh/rec/153/base -> origin/gh/rec/153/base 2025-12-04T10:32:19.4056637Z * [new branch] gh/rec/153/head -> origin/gh/rec/153/head 2025-12-04T10:32:19.4056701Z * [new branch] gh/rec/153/orig -> origin/gh/rec/153/orig 2025-12-04T10:32:19.4056763Z * [new branch] gh/rec/154/base -> origin/gh/rec/154/base 2025-12-04T10:32:19.4056827Z * [new branch] gh/rec/154/head -> origin/gh/rec/154/head 2025-12-04T10:32:19.4056890Z * [new branch] gh/rec/154/orig -> origin/gh/rec/154/orig 2025-12-04T10:32:19.4056954Z * [new branch] gh/rec/164/base -> origin/gh/rec/164/base 2025-12-04T10:32:19.4057018Z * [new branch] gh/rec/164/head -> origin/gh/rec/164/head 2025-12-04T10:32:19.4057084Z * [new branch] gh/rec/164/orig -> origin/gh/rec/164/orig 2025-12-04T10:32:19.4057148Z * [new branch] gh/rec/166/base -> origin/gh/rec/166/base 2025-12-04T10:32:19.4057211Z * [new branch] gh/rec/166/head -> origin/gh/rec/166/head 2025-12-04T10:32:19.4057275Z * [new branch] gh/rec/166/orig -> origin/gh/rec/166/orig 2025-12-04T10:32:19.4057338Z * [new branch] gh/rec/167/base -> origin/gh/rec/167/base 2025-12-04T10:32:19.4057434Z * [new branch] gh/rec/167/head -> origin/gh/rec/167/head 2025-12-04T10:32:19.4057498Z * [new branch] gh/rec/167/orig -> origin/gh/rec/167/orig 2025-12-04T10:32:19.4057560Z * [new branch] gh/rec/168/base -> origin/gh/rec/168/base 2025-12-04T10:32:19.4057627Z * [new branch] gh/rec/168/head -> origin/gh/rec/168/head 2025-12-04T10:32:19.4057691Z * [new branch] 
gh/rec/168/orig -> origin/gh/rec/168/orig 2025-12-04T10:32:19.4057754Z * [new branch] gh/rec/169/base -> origin/gh/rec/169/base 2025-12-04T10:32:19.4057822Z * [new branch] gh/rec/169/head -> origin/gh/rec/169/head 2025-12-04T10:32:19.4057884Z * [new branch] gh/rec/169/orig -> origin/gh/rec/169/orig 2025-12-04T10:32:19.4057947Z * [new branch] gh/rec/170/base -> origin/gh/rec/170/base 2025-12-04T10:32:19.4058012Z * [new branch] gh/rec/170/head -> origin/gh/rec/170/head 2025-12-04T10:32:19.4058076Z * [new branch] gh/rec/170/orig -> origin/gh/rec/170/orig 2025-12-04T10:32:19.4058137Z * [new branch] gh/rec/171/base -> origin/gh/rec/171/base 2025-12-04T10:32:19.4058200Z * [new branch] gh/rec/171/head -> origin/gh/rec/171/head 2025-12-04T10:32:19.4058292Z * [new branch] gh/rec/171/orig -> origin/gh/rec/171/orig 2025-12-04T10:32:19.4058356Z * [new branch] gh/rec/172/base -> origin/gh/rec/172/base 2025-12-04T10:32:19.4058418Z * [new branch] gh/rec/172/head -> origin/gh/rec/172/head 2025-12-04T10:32:19.4058480Z * [new branch] gh/rec/172/orig -> origin/gh/rec/172/orig 2025-12-04T10:32:19.4058541Z * [new branch] gh/rec/173/base -> origin/gh/rec/173/base 2025-12-04T10:32:19.4058607Z * [new branch] gh/rec/173/head -> origin/gh/rec/173/head 2025-12-04T10:32:19.4058671Z * [new branch] gh/rec/173/orig -> origin/gh/rec/173/orig 2025-12-04T10:32:19.4058733Z * [new branch] gh/rec/174/base -> origin/gh/rec/174/base 2025-12-04T10:32:19.4058797Z * [new branch] gh/rec/174/head -> origin/gh/rec/174/head 2025-12-04T10:32:19.4058860Z * [new branch] gh/rec/174/orig -> origin/gh/rec/174/orig 2025-12-04T10:32:19.4058922Z * [new branch] gh/rec/175/base -> origin/gh/rec/175/base 2025-12-04T10:32:19.4058985Z * [new branch] gh/rec/175/head -> origin/gh/rec/175/head 2025-12-04T10:32:19.4059049Z * [new branch] gh/rec/175/orig -> origin/gh/rec/175/orig 2025-12-04T10:32:19.4059111Z * [new branch] gh/rec/176/base -> origin/gh/rec/176/base 2025-12-04T10:32:19.4059173Z * [new branch] gh/rec/176/head -> origin/gh/rec/176/head 2025-12-04T10:32:19.4059237Z * [new branch] gh/rec/176/orig -> origin/gh/rec/176/orig 2025-12-04T10:32:19.4059300Z * [new branch] gh/rec/177/base -> origin/gh/rec/177/base 2025-12-04T10:32:19.4059362Z * [new branch] gh/rec/177/head -> origin/gh/rec/177/head 2025-12-04T10:32:19.4059423Z * [new branch] gh/rec/177/orig -> origin/gh/rec/177/orig 2025-12-04T10:32:19.4059515Z * [new branch] gh/robert-hardwick/3/base -> origin/gh/robert-hardwick/3/base 2025-12-04T10:32:19.4059635Z * [new branch] gh/robert-hardwick/3/head -> origin/gh/robert-hardwick/3/head 2025-12-04T10:32:19.4059718Z * [new branch] gh/robert-hardwick/3/orig -> origin/gh/robert-hardwick/3/orig 2025-12-04T10:32:19.4059801Z * [new branch] gh/robert-hardwick/4/base -> origin/gh/robert-hardwick/4/base 2025-12-04T10:32:19.4059880Z * [new branch] gh/robert-hardwick/4/head -> origin/gh/robert-hardwick/4/head 2025-12-04T10:32:19.4060006Z * [new branch] gh/robert-hardwick/4/orig -> origin/gh/robert-hardwick/4/orig 2025-12-04T10:32:19.4060089Z * [new branch] gh/robert-hardwick/5/base -> origin/gh/robert-hardwick/5/base 2025-12-04T10:32:19.4060169Z * [new branch] gh/robert-hardwick/5/head -> origin/gh/robert-hardwick/5/head 2025-12-04T10:32:19.4060250Z * [new branch] gh/robert-hardwick/5/orig -> origin/gh/robert-hardwick/5/orig 2025-12-04T10:32:19.4060335Z * [new branch] gh/robert-hardwick/6/base -> origin/gh/robert-hardwick/6/base 2025-12-04T10:32:19.4060415Z * [new branch] gh/robert-hardwick/6/head -> origin/gh/robert-hardwick/6/head 
2025-12-04T10:32:19.4060496Z * [new branch] gh/robert-hardwick/6/orig -> origin/gh/robert-hardwick/6/orig 2025-12-04T10:32:19.4060578Z * [new branch] gh/robert-hardwick/7/base -> origin/gh/robert-hardwick/7/base 2025-12-04T10:32:19.4060660Z * [new branch] gh/robert-hardwick/7/head -> origin/gh/robert-hardwick/7/head 2025-12-04T10:32:19.4060742Z * [new branch] gh/robert-hardwick/7/orig -> origin/gh/robert-hardwick/7/orig 2025-12-04T10:32:19.4060824Z * [new branch] gh/robert-hardwick/8/base -> origin/gh/robert-hardwick/8/base 2025-12-04T10:32:19.4060958Z * [new branch] gh/robert-hardwick/8/head -> origin/gh/robert-hardwick/8/head 2025-12-04T10:32:19.4061040Z * [new branch] gh/robert-hardwick/8/orig -> origin/gh/robert-hardwick/8/orig 2025-12-04T10:32:19.4061120Z * [new branch] gh/robert-hardwick/9/base -> origin/gh/robert-hardwick/9/base 2025-12-04T10:32:19.4061199Z * [new branch] gh/robert-hardwick/9/head -> origin/gh/robert-hardwick/9/head 2025-12-04T10:32:19.4061280Z * [new branch] gh/robert-hardwick/9/orig -> origin/gh/robert-hardwick/9/orig 2025-12-04T10:32:19.4061351Z * [new branch] gh/rtimpe/1/base -> origin/gh/rtimpe/1/base 2025-12-04T10:32:19.4061420Z * [new branch] gh/rtimpe/1/head -> origin/gh/rtimpe/1/head 2025-12-04T10:32:19.4061488Z * [new branch] gh/rtimpe/2/base -> origin/gh/rtimpe/2/base 2025-12-04T10:32:19.4061553Z * [new branch] gh/rtimpe/2/head -> origin/gh/rtimpe/2/head 2025-12-04T10:32:19.4061623Z * [new branch] gh/rtimpe/22/base -> origin/gh/rtimpe/22/base 2025-12-04T10:32:19.4061690Z * [new branch] gh/rtimpe/22/head -> origin/gh/rtimpe/22/head 2025-12-04T10:32:19.4061756Z * [new branch] gh/rtimpe/22/orig -> origin/gh/rtimpe/22/orig 2025-12-04T10:32:19.4061821Z * [new branch] gh/rtimpe/23/base -> origin/gh/rtimpe/23/base 2025-12-04T10:32:19.4061889Z * [new branch] gh/rtimpe/23/head -> origin/gh/rtimpe/23/head 2025-12-04T10:32:19.4061955Z * [new branch] gh/rtimpe/23/orig -> origin/gh/rtimpe/23/orig 2025-12-04T10:32:19.4062022Z * [new branch] gh/rtimpe/24/base -> origin/gh/rtimpe/24/base 2025-12-04T10:32:19.4062090Z * [new branch] gh/rtimpe/24/head -> origin/gh/rtimpe/24/head 2025-12-04T10:32:19.4062156Z * [new branch] gh/rtimpe/24/orig -> origin/gh/rtimpe/24/orig 2025-12-04T10:32:19.4062223Z * [new branch] gh/rtimpe/25/base -> origin/gh/rtimpe/25/base 2025-12-04T10:32:19.4062291Z * [new branch] gh/rtimpe/25/head -> origin/gh/rtimpe/25/head 2025-12-04T10:32:19.4062357Z * [new branch] gh/rtimpe/25/orig -> origin/gh/rtimpe/25/orig 2025-12-04T10:32:19.4062424Z * [new branch] gh/rtimpe/26/base -> origin/gh/rtimpe/26/base 2025-12-04T10:32:19.4062490Z * [new branch] gh/rtimpe/26/head -> origin/gh/rtimpe/26/head 2025-12-04T10:32:19.4062555Z * [new branch] gh/rtimpe/26/orig -> origin/gh/rtimpe/26/orig 2025-12-04T10:32:19.4062661Z * [new branch] gh/rtimpe/27/base -> origin/gh/rtimpe/27/base 2025-12-04T10:32:19.4062727Z * [new branch] gh/rtimpe/27/head -> origin/gh/rtimpe/27/head 2025-12-04T10:32:19.4062793Z * [new branch] gh/rtimpe/27/orig -> origin/gh/rtimpe/27/orig 2025-12-04T10:32:19.4062858Z * [new branch] gh/rtimpe/28/base -> origin/gh/rtimpe/28/base 2025-12-04T10:32:19.4062924Z * [new branch] gh/rtimpe/28/head -> origin/gh/rtimpe/28/head 2025-12-04T10:32:19.4062990Z * [new branch] gh/rtimpe/28/orig -> origin/gh/rtimpe/28/orig 2025-12-04T10:32:19.4063057Z * [new branch] gh/rtimpe/29/base -> origin/gh/rtimpe/29/base 2025-12-04T10:32:19.4063122Z * [new branch] gh/rtimpe/29/head -> origin/gh/rtimpe/29/head 2025-12-04T10:32:19.4063187Z * [new branch] gh/rtimpe/29/orig -> 
origin/gh/rtimpe/29/orig 2025-12-04T10:32:19.4063257Z * [new branch] gh/rtimpe/3/base -> origin/gh/rtimpe/3/base 2025-12-04T10:32:19.4063322Z * [new branch] gh/rtimpe/3/head -> origin/gh/rtimpe/3/head 2025-12-04T10:32:19.4063387Z * [new branch] gh/rtimpe/30/base -> origin/gh/rtimpe/30/base 2025-12-04T10:32:19.4063454Z * [new branch] gh/rtimpe/30/head -> origin/gh/rtimpe/30/head 2025-12-04T10:32:19.4063546Z * [new branch] gh/rtimpe/30/orig -> origin/gh/rtimpe/30/orig 2025-12-04T10:32:19.4063612Z * [new branch] gh/rtimpe/31/base -> origin/gh/rtimpe/31/base 2025-12-04T10:32:19.4063678Z * [new branch] gh/rtimpe/31/head -> origin/gh/rtimpe/31/head 2025-12-04T10:32:19.4063743Z * [new branch] gh/rtimpe/31/orig -> origin/gh/rtimpe/31/orig 2025-12-04T10:32:19.4063807Z * [new branch] gh/rtimpe/32/base -> origin/gh/rtimpe/32/base 2025-12-04T10:32:19.4063874Z * [new branch] gh/rtimpe/32/head -> origin/gh/rtimpe/32/head 2025-12-04T10:32:19.4063939Z * [new branch] gh/rtimpe/32/orig -> origin/gh/rtimpe/32/orig 2025-12-04T10:32:19.4064005Z * [new branch] gh/rtimpe/33/base -> origin/gh/rtimpe/33/base 2025-12-04T10:32:19.4064070Z * [new branch] gh/rtimpe/33/head -> origin/gh/rtimpe/33/head 2025-12-04T10:32:19.4064137Z * [new branch] gh/rtimpe/33/orig -> origin/gh/rtimpe/33/orig 2025-12-04T10:32:19.4064204Z * [new branch] gh/rtimpe/34/base -> origin/gh/rtimpe/34/base 2025-12-04T10:32:19.4064269Z * [new branch] gh/rtimpe/34/head -> origin/gh/rtimpe/34/head 2025-12-04T10:32:19.4064334Z * [new branch] gh/rtimpe/34/orig -> origin/gh/rtimpe/34/orig 2025-12-04T10:32:19.4064400Z * [new branch] gh/rtimpe/35/base -> origin/gh/rtimpe/35/base 2025-12-04T10:32:19.4064467Z * [new branch] gh/rtimpe/35/head -> origin/gh/rtimpe/35/head 2025-12-04T10:32:19.4064533Z * [new branch] gh/rtimpe/35/orig -> origin/gh/rtimpe/35/orig 2025-12-04T10:32:19.4064600Z * [new branch] gh/rtimpe/4/base -> origin/gh/rtimpe/4/base 2025-12-04T10:32:19.4064667Z * [new branch] gh/rtimpe/4/head -> origin/gh/rtimpe/4/head 2025-12-04T10:32:19.4064750Z * [new branch] gh/ruisizhang123/1/base -> origin/gh/ruisizhang123/1/base 2025-12-04T10:32:19.4064832Z * [new branch] gh/ruisizhang123/1/head -> origin/gh/ruisizhang123/1/head 2025-12-04T10:32:19.4064908Z * [new branch] gh/ruisizhang123/1/orig -> origin/gh/ruisizhang123/1/orig 2025-12-04T10:32:19.4064984Z * [new branch] gh/ruisizhang123/4/base -> origin/gh/ruisizhang123/4/base 2025-12-04T10:32:19.4065063Z * [new branch] gh/ruisizhang123/4/head -> origin/gh/ruisizhang123/4/head 2025-12-04T10:32:19.4065170Z * [new branch] gh/ruisizhang123/4/orig -> origin/gh/ruisizhang123/4/orig 2025-12-04T10:32:19.4065244Z * [new branch] gh/ruisizhang123/5/base -> origin/gh/ruisizhang123/5/base 2025-12-04T10:32:19.4065321Z * [new branch] gh/ruisizhang123/5/head -> origin/gh/ruisizhang123/5/head 2025-12-04T10:32:19.4065397Z * [new branch] gh/ruisizhang123/5/orig -> origin/gh/ruisizhang123/5/orig 2025-12-04T10:32:19.4065473Z * [new branch] gh/ruisizhang123/6/base -> origin/gh/ruisizhang123/6/base 2025-12-04T10:32:19.4065551Z * [new branch] gh/ruisizhang123/6/head -> origin/gh/ruisizhang123/6/head 2025-12-04T10:32:19.4065625Z * [new branch] gh/ruisizhang123/6/orig -> origin/gh/ruisizhang123/6/orig 2025-12-04T10:32:19.4065702Z * [new branch] gh/ruisizhang123/7/base -> origin/gh/ruisizhang123/7/base 2025-12-04T10:32:19.4065776Z * [new branch] gh/ruisizhang123/7/head -> origin/gh/ruisizhang123/7/head 2025-12-04T10:32:19.4065854Z * [new branch] gh/ruisizhang123/7/orig -> origin/gh/ruisizhang123/7/orig 
2025-12-04T10:32:19.4065929Z * [new branch] gh/ruisizhang123/8/base -> origin/gh/ruisizhang123/8/base 2025-12-04T10:32:19.4066004Z * [new branch] gh/ruisizhang123/8/head -> origin/gh/ruisizhang123/8/head 2025-12-04T10:32:19.4066105Z * [new branch] gh/ruisizhang123/8/orig -> origin/gh/ruisizhang123/8/orig 2025-12-04T10:32:19.4066182Z * [new branch] gh/ruisizhang123/9/base -> origin/gh/ruisizhang123/9/base 2025-12-04T10:32:19.4066258Z * [new branch] gh/ruisizhang123/9/head -> origin/gh/ruisizhang123/9/head 2025-12-04T10:32:19.4066332Z * [new branch] gh/ruisizhang123/9/orig -> origin/gh/ruisizhang123/9/orig 2025-12-04T10:32:19.4066412Z * [new branch] gh/seemethere/52/base -> origin/gh/seemethere/52/base 2025-12-04T10:32:19.4066485Z * [new branch] gh/seemethere/52/head -> origin/gh/seemethere/52/head 2025-12-04T10:32:19.4066561Z * [new branch] gh/seemethere/52/orig -> origin/gh/seemethere/52/orig 2025-12-04T10:32:19.4066636Z * [new branch] gh/seemethere/53/base -> origin/gh/seemethere/53/base 2025-12-04T10:32:19.4066711Z * [new branch] gh/seemethere/53/head -> origin/gh/seemethere/53/head 2025-12-04T10:32:19.4066786Z * [new branch] gh/seemethere/53/orig -> origin/gh/seemethere/53/orig 2025-12-04T10:32:19.4066860Z * [new branch] gh/seemethere/54/base -> origin/gh/seemethere/54/base 2025-12-04T10:32:19.4066931Z * [new branch] gh/seemethere/54/head -> origin/gh/seemethere/54/head 2025-12-04T10:32:19.4067003Z * [new branch] gh/seemethere/54/orig -> origin/gh/seemethere/54/orig 2025-12-04T10:32:19.4067076Z * [new branch] gh/seemethere/55/base -> origin/gh/seemethere/55/base 2025-12-04T10:32:19.4067148Z * [new branch] gh/seemethere/55/head -> origin/gh/seemethere/55/head 2025-12-04T10:32:19.4067225Z * [new branch] gh/seemethere/55/orig -> origin/gh/seemethere/55/orig 2025-12-04T10:32:19.4067298Z * [new branch] gh/seemethere/59/base -> origin/gh/seemethere/59/base 2025-12-04T10:32:19.4067369Z * [new branch] gh/seemethere/59/head -> origin/gh/seemethere/59/head 2025-12-04T10:32:19.4067445Z * [new branch] gh/seemethere/59/orig -> origin/gh/seemethere/59/orig 2025-12-04T10:32:19.4067516Z * [new branch] gh/seemethere/62/base -> origin/gh/seemethere/62/base 2025-12-04T10:32:19.4067588Z * [new branch] gh/seemethere/62/head -> origin/gh/seemethere/62/head 2025-12-04T10:32:19.4067661Z * [new branch] gh/seemethere/62/orig -> origin/gh/seemethere/62/orig 2025-12-04T10:32:19.4067733Z * [new branch] gh/seemethere/63/base -> origin/gh/seemethere/63/base 2025-12-04T10:32:19.4067838Z * [new branch] gh/seemethere/63/head -> origin/gh/seemethere/63/head 2025-12-04T10:32:19.4067911Z * [new branch] gh/seemethere/63/orig -> origin/gh/seemethere/63/orig 2025-12-04T10:32:19.4067983Z * [new branch] gh/seemethere/71/base -> origin/gh/seemethere/71/base 2025-12-04T10:32:19.4068054Z * [new branch] gh/seemethere/71/head -> origin/gh/seemethere/71/head 2025-12-04T10:32:19.4068129Z * [new branch] gh/seemethere/71/orig -> origin/gh/seemethere/71/orig 2025-12-04T10:32:19.4068201Z * [new branch] gh/seemethere/72/base -> origin/gh/seemethere/72/base 2025-12-04T10:32:19.4068272Z * [new branch] gh/seemethere/72/head -> origin/gh/seemethere/72/head 2025-12-04T10:32:19.4068346Z * [new branch] gh/seemethere/72/orig -> origin/gh/seemethere/72/orig 2025-12-04T10:32:19.4068419Z * [new branch] gh/seemethere/73/base -> origin/gh/seemethere/73/base 2025-12-04T10:32:19.4068492Z * [new branch] gh/seemethere/73/head -> origin/gh/seemethere/73/head 2025-12-04T10:32:19.4068565Z * [new branch] gh/seemethere/73/orig -> origin/gh/seemethere/73/orig 
2025-12-04T10:32:19.4068637Z * [new branch] gh/seemethere/74/base -> origin/gh/seemethere/74/base 2025-12-04T10:32:19.4068712Z * [new branch] gh/seemethere/74/head -> origin/gh/seemethere/74/head 2025-12-04T10:32:19.4068811Z * [new branch] gh/seemethere/74/orig -> origin/gh/seemethere/74/orig 2025-12-04T10:32:19.4068883Z * [new branch] gh/seemethere/75/base -> origin/gh/seemethere/75/base 2025-12-04T10:32:19.4068957Z * [new branch] gh/seemethere/75/head -> origin/gh/seemethere/75/head 2025-12-04T10:32:19.4069029Z * [new branch] gh/seemethere/75/orig -> origin/gh/seemethere/75/orig 2025-12-04T10:32:19.4069101Z * [new branch] gh/seemethere/76/base -> origin/gh/seemethere/76/base 2025-12-04T10:32:19.4069175Z * [new branch] gh/seemethere/76/head -> origin/gh/seemethere/76/head 2025-12-04T10:32:19.4069249Z * [new branch] gh/seemethere/76/orig -> origin/gh/seemethere/76/orig 2025-12-04T10:32:19.4069325Z * [new branch] gh/shunting314/145/base -> origin/gh/shunting314/145/base 2025-12-04T10:32:19.4069402Z * [new branch] gh/shunting314/145/head -> origin/gh/shunting314/145/head 2025-12-04T10:32:19.4069477Z * [new branch] gh/shunting314/145/orig -> origin/gh/shunting314/145/orig 2025-12-04T10:32:19.4069552Z * [new branch] gh/shunting314/176/base -> origin/gh/shunting314/176/base 2025-12-04T10:32:19.4069676Z * [new branch] gh/shunting314/176/head -> origin/gh/shunting314/176/head 2025-12-04T10:32:19.4069752Z * [new branch] gh/shunting314/176/orig -> origin/gh/shunting314/176/orig 2025-12-04T10:32:19.4069825Z * [new branch] gh/shunting314/249/base -> origin/gh/shunting314/249/base 2025-12-04T10:32:19.4069901Z * [new branch] gh/shunting314/249/head -> origin/gh/shunting314/249/head 2025-12-04T10:32:19.4069975Z * [new branch] gh/shunting314/249/orig -> origin/gh/shunting314/249/orig 2025-12-04T10:32:19.4070049Z * [new branch] gh/shunting314/253/base -> origin/gh/shunting314/253/base 2025-12-04T10:32:19.4070125Z * [new branch] gh/shunting314/253/head -> origin/gh/shunting314/253/head 2025-12-04T10:32:19.4070199Z * [new branch] gh/shunting314/253/orig -> origin/gh/shunting314/253/orig 2025-12-04T10:32:19.4070273Z * [new branch] gh/shunting314/256/base -> origin/gh/shunting314/256/base 2025-12-04T10:32:19.4070350Z * [new branch] gh/shunting314/256/head -> origin/gh/shunting314/256/head 2025-12-04T10:32:19.4070424Z * [new branch] gh/shunting314/256/orig -> origin/gh/shunting314/256/orig 2025-12-04T10:32:19.4070550Z * [new branch] gh/shunting314/257/base -> origin/gh/shunting314/257/base 2025-12-04T10:32:19.4070623Z * [new branch] gh/shunting314/257/head -> origin/gh/shunting314/257/head 2025-12-04T10:32:19.4070698Z * [new branch] gh/shunting314/257/orig -> origin/gh/shunting314/257/orig 2025-12-04T10:32:19.4070776Z * [new branch] gh/shunting314/258/base -> origin/gh/shunting314/258/base 2025-12-04T10:32:19.4070852Z * [new branch] gh/shunting314/258/head -> origin/gh/shunting314/258/head 2025-12-04T10:32:19.4070927Z * [new branch] gh/shunting314/258/orig -> origin/gh/shunting314/258/orig 2025-12-04T10:32:19.4071002Z * [new branch] gh/shunting314/259/base -> origin/gh/shunting314/259/base 2025-12-04T10:32:19.4071075Z * [new branch] gh/shunting314/259/head -> origin/gh/shunting314/259/head 2025-12-04T10:32:19.4071147Z * [new branch] gh/shunting314/259/orig -> origin/gh/shunting314/259/orig 2025-12-04T10:32:19.4071223Z * [new branch] gh/shunting314/260/base -> origin/gh/shunting314/260/base 2025-12-04T10:32:19.4071295Z * [new branch] gh/shunting314/260/head -> origin/gh/shunting314/260/head 
2025-12-04T10:32:19.4071368Z * [new branch] gh/shunting314/260/orig -> origin/gh/shunting314/260/orig 2025-12-04T10:32:19.4071489Z * [new branch] gh/shunting314/261/base -> origin/gh/shunting314/261/base 2025-12-04T10:32:19.4071563Z * [new branch] gh/shunting314/261/head -> origin/gh/shunting314/261/head 2025-12-04T10:32:19.4071637Z * [new branch] gh/shunting314/261/orig -> origin/gh/shunting314/261/orig 2025-12-04T10:32:19.4071713Z * [new branch] gh/shunting314/262/base -> origin/gh/shunting314/262/base 2025-12-04T10:32:19.4071787Z * [new branch] gh/shunting314/262/head -> origin/gh/shunting314/262/head 2025-12-04T10:32:19.4071860Z * [new branch] gh/shunting314/262/orig -> origin/gh/shunting314/262/orig 2025-12-04T10:32:19.4071935Z * [new branch] gh/shunting314/263/base -> origin/gh/shunting314/263/base 2025-12-04T10:32:19.4072009Z * [new branch] gh/shunting314/263/head -> origin/gh/shunting314/263/head 2025-12-04T10:32:19.4072082Z * [new branch] gh/shunting314/263/orig -> origin/gh/shunting314/263/orig 2025-12-04T10:32:19.4072157Z * [new branch] gh/shunting314/264/base -> origin/gh/shunting314/264/base 2025-12-04T10:32:19.4072231Z * [new branch] gh/shunting314/264/head -> origin/gh/shunting314/264/head 2025-12-04T10:32:19.4072305Z * [new branch] gh/shunting314/264/orig -> origin/gh/shunting314/264/orig 2025-12-04T10:32:19.4072378Z * [new branch] gh/shunting314/265/base -> origin/gh/shunting314/265/base 2025-12-04T10:32:19.4072451Z * [new branch] gh/shunting314/265/head -> origin/gh/shunting314/265/head 2025-12-04T10:32:19.4072526Z * [new branch] gh/shunting314/265/orig -> origin/gh/shunting314/265/orig 2025-12-04T10:32:19.4072599Z * [new branch] gh/shunting314/266/base -> origin/gh/shunting314/266/base 2025-12-04T10:32:19.4072673Z * [new branch] gh/shunting314/266/head -> origin/gh/shunting314/266/head 2025-12-04T10:32:19.4072747Z * [new branch] gh/shunting314/266/orig -> origin/gh/shunting314/266/orig 2025-12-04T10:32:19.4072820Z * [new branch] gh/shunting314/267/base -> origin/gh/shunting314/267/base 2025-12-04T10:32:19.4072893Z * [new branch] gh/shunting314/267/head -> origin/gh/shunting314/267/head 2025-12-04T10:32:19.4072968Z * [new branch] gh/shunting314/267/orig -> origin/gh/shunting314/267/orig 2025-12-04T10:32:19.4073041Z * [new branch] gh/shunting314/268/base -> origin/gh/shunting314/268/base 2025-12-04T10:32:19.4073114Z * [new branch] gh/shunting314/268/head -> origin/gh/shunting314/268/head 2025-12-04T10:32:19.4073225Z * [new branch] gh/shunting314/268/orig -> origin/gh/shunting314/268/orig 2025-12-04T10:32:19.4073299Z * [new branch] gh/shunting314/269/base -> origin/gh/shunting314/269/base 2025-12-04T10:32:19.4073372Z * [new branch] gh/shunting314/269/head -> origin/gh/shunting314/269/head 2025-12-04T10:32:19.4073448Z * [new branch] gh/shunting314/269/orig -> origin/gh/shunting314/269/orig 2025-12-04T10:32:19.4073522Z * [new branch] gh/silverguo/1/base -> origin/gh/silverguo/1/base 2025-12-04T10:32:19.4073598Z * [new branch] gh/silverguo/1/head -> origin/gh/silverguo/1/head 2025-12-04T10:32:19.4073669Z * [new branch] gh/silverguo/2/base -> origin/gh/silverguo/2/base 2025-12-04T10:32:19.4073740Z * [new branch] gh/silverguo/2/head -> origin/gh/silverguo/2/head 2025-12-04T10:32:19.4073812Z * [new branch] gh/silverguo/3/base -> origin/gh/silverguo/3/base 2025-12-04T10:32:19.4073883Z * [new branch] gh/silverguo/3/head -> origin/gh/silverguo/3/head 2025-12-04T10:32:19.4073953Z * [new branch] gh/silverguo/4/base -> origin/gh/silverguo/4/base 2025-12-04T10:32:19.4074023Z * [new 
branch] gh/silverguo/4/head -> origin/gh/silverguo/4/head 2025-12-04T10:32:19.4074128Z * [new branch] gh/slayton58/39/base -> origin/gh/slayton58/39/base 2025-12-04T10:32:19.4074199Z * [new branch] gh/slayton58/39/head -> origin/gh/slayton58/39/head 2025-12-04T10:32:19.4074270Z * [new branch] gh/slayton58/39/orig -> origin/gh/slayton58/39/orig 2025-12-04T10:32:19.4074339Z * [new branch] gh/slayton58/42/base -> origin/gh/slayton58/42/base 2025-12-04T10:32:19.4074408Z * [new branch] gh/slayton58/42/head -> origin/gh/slayton58/42/head 2025-12-04T10:32:19.4074477Z * [new branch] gh/slayton58/42/orig -> origin/gh/slayton58/42/orig 2025-12-04T10:32:19.4074547Z * [new branch] gh/slayton58/43/base -> origin/gh/slayton58/43/base 2025-12-04T10:32:19.4074617Z * [new branch] gh/slayton58/43/head -> origin/gh/slayton58/43/head 2025-12-04T10:32:19.4074687Z * [new branch] gh/slayton58/43/orig -> origin/gh/slayton58/43/orig 2025-12-04T10:32:19.4074757Z * [new branch] gh/slayton58/44/base -> origin/gh/slayton58/44/base 2025-12-04T10:32:19.4074828Z * [new branch] gh/slayton58/44/head -> origin/gh/slayton58/44/head 2025-12-04T10:32:19.4074900Z * [new branch] gh/slayton58/44/orig -> origin/gh/slayton58/44/orig 2025-12-04T10:32:19.4074970Z * [new branch] gh/slayton58/45/base -> origin/gh/slayton58/45/base 2025-12-04T10:32:19.4075040Z * [new branch] gh/slayton58/45/head -> origin/gh/slayton58/45/head 2025-12-04T10:32:19.4075112Z * [new branch] gh/slayton58/45/orig -> origin/gh/slayton58/45/orig 2025-12-04T10:32:19.4075183Z * [new branch] gh/slayton58/46/base -> origin/gh/slayton58/46/base 2025-12-04T10:32:19.4075255Z * [new branch] gh/slayton58/46/head -> origin/gh/slayton58/46/head 2025-12-04T10:32:19.4075325Z * [new branch] gh/slayton58/46/orig -> origin/gh/slayton58/46/orig 2025-12-04T10:32:19.4075397Z * [new branch] gh/slayton58/6/base -> origin/gh/slayton58/6/base 2025-12-04T10:32:19.4075468Z * [new branch] gh/slayton58/6/head -> origin/gh/slayton58/6/head 2025-12-04T10:32:19.4075538Z * [new branch] gh/slayton58/7/base -> origin/gh/slayton58/7/base 2025-12-04T10:32:19.4075606Z * [new branch] gh/slayton58/7/head -> origin/gh/slayton58/7/head 2025-12-04T10:32:19.4075680Z * [new branch] gh/soulitzer/269/base -> origin/gh/soulitzer/269/base 2025-12-04T10:32:19.4075753Z * [new branch] gh/soulitzer/269/head -> origin/gh/soulitzer/269/head 2025-12-04T10:32:19.4075856Z * [new branch] gh/soulitzer/269/orig -> origin/gh/soulitzer/269/orig 2025-12-04T10:32:19.4075933Z * [new branch] gh/soulitzer/276/base -> origin/gh/soulitzer/276/base 2025-12-04T10:32:19.4076004Z * [new branch] gh/soulitzer/276/head -> origin/gh/soulitzer/276/head 2025-12-04T10:32:19.4076078Z * [new branch] gh/soulitzer/276/orig -> origin/gh/soulitzer/276/orig 2025-12-04T10:32:19.4076152Z * [new branch] gh/soulitzer/287/base -> origin/gh/soulitzer/287/base 2025-12-04T10:32:19.4076224Z * [new branch] gh/soulitzer/287/head -> origin/gh/soulitzer/287/head 2025-12-04T10:32:19.4076297Z * [new branch] gh/soulitzer/287/orig -> origin/gh/soulitzer/287/orig 2025-12-04T10:32:19.4076374Z * [new branch] gh/soulitzer/296/base -> origin/gh/soulitzer/296/base 2025-12-04T10:32:19.4076449Z * [new branch] gh/soulitzer/296/head -> origin/gh/soulitzer/296/head 2025-12-04T10:32:19.4076522Z * [new branch] gh/soulitzer/296/orig -> origin/gh/soulitzer/296/orig 2025-12-04T10:32:19.4076597Z * [new branch] gh/soulitzer/299/base -> origin/gh/soulitzer/299/base 2025-12-04T10:32:19.4076669Z * [new branch] gh/soulitzer/299/head -> origin/gh/soulitzer/299/head 
2025-12-04T10:32:19.4076767Z * [new branch] gh/soulitzer/299/orig -> origin/gh/soulitzer/299/orig 2025-12-04T10:32:19.4076838Z * [new branch] gh/soulitzer/300/base -> origin/gh/soulitzer/300/base 2025-12-04T10:32:19.4076910Z * [new branch] gh/soulitzer/300/head -> origin/gh/soulitzer/300/head 2025-12-04T10:32:19.4076983Z * [new branch] gh/soulitzer/300/orig -> origin/gh/soulitzer/300/orig 2025-12-04T10:32:19.4077054Z * [new branch] gh/soulitzer/301/base -> origin/gh/soulitzer/301/base 2025-12-04T10:32:19.4077128Z * [new branch] gh/soulitzer/301/head -> origin/gh/soulitzer/301/head 2025-12-04T10:32:19.4077203Z * [new branch] gh/soulitzer/301/orig -> origin/gh/soulitzer/301/orig 2025-12-04T10:32:19.4077275Z * [new branch] gh/soulitzer/313/base -> origin/gh/soulitzer/313/base 2025-12-04T10:32:19.4077346Z * [new branch] gh/soulitzer/313/head -> origin/gh/soulitzer/313/head 2025-12-04T10:32:19.4077420Z * [new branch] gh/soulitzer/313/orig -> origin/gh/soulitzer/313/orig 2025-12-04T10:32:19.4077493Z * [new branch] gh/soulitzer/319/base -> origin/gh/soulitzer/319/base 2025-12-04T10:32:19.4077564Z * [new branch] gh/soulitzer/319/head -> origin/gh/soulitzer/319/head 2025-12-04T10:32:19.4077637Z * [new branch] gh/soulitzer/319/orig -> origin/gh/soulitzer/319/orig 2025-12-04T10:32:19.4077709Z * [new branch] gh/soulitzer/320/base -> origin/gh/soulitzer/320/base 2025-12-04T10:32:19.4077782Z * [new branch] gh/soulitzer/320/head -> origin/gh/soulitzer/320/head 2025-12-04T10:32:19.4077855Z * [new branch] gh/soulitzer/320/orig -> origin/gh/soulitzer/320/orig 2025-12-04T10:32:19.4077926Z * [new branch] gh/soulitzer/336/base -> origin/gh/soulitzer/336/base 2025-12-04T10:32:19.4077997Z * [new branch] gh/soulitzer/336/head -> origin/gh/soulitzer/336/head 2025-12-04T10:32:19.4078070Z * [new branch] gh/soulitzer/336/orig -> origin/gh/soulitzer/336/orig 2025-12-04T10:32:19.4078142Z * [new branch] gh/soulitzer/347/base -> origin/gh/soulitzer/347/base 2025-12-04T10:32:19.4078214Z * [new branch] gh/soulitzer/347/head -> origin/gh/soulitzer/347/head 2025-12-04T10:32:19.4078289Z * [new branch] gh/soulitzer/347/orig -> origin/gh/soulitzer/347/orig 2025-12-04T10:32:19.4078361Z * [new branch] gh/soulitzer/349/base -> origin/gh/soulitzer/349/base 2025-12-04T10:32:19.4078459Z * [new branch] gh/soulitzer/349/head -> origin/gh/soulitzer/349/head 2025-12-04T10:32:19.4078529Z * [new branch] gh/soulitzer/349/orig -> origin/gh/soulitzer/349/orig 2025-12-04T10:32:19.4078600Z * [new branch] gh/soulitzer/350/base -> origin/gh/soulitzer/350/base 2025-12-04T10:32:19.4078676Z * [new branch] gh/soulitzer/350/head -> origin/gh/soulitzer/350/head 2025-12-04T10:32:19.4078747Z * [new branch] gh/soulitzer/350/orig -> origin/gh/soulitzer/350/orig 2025-12-04T10:32:19.4078818Z * [new branch] gh/soulitzer/351/base -> origin/gh/soulitzer/351/base 2025-12-04T10:32:19.4078890Z * [new branch] gh/soulitzer/351/head -> origin/gh/soulitzer/351/head 2025-12-04T10:32:19.4078962Z * [new branch] gh/soulitzer/351/orig -> origin/gh/soulitzer/351/orig 2025-12-04T10:32:19.4079034Z * [new branch] gh/soulitzer/353/base -> origin/gh/soulitzer/353/base 2025-12-04T10:32:19.4079108Z * [new branch] gh/soulitzer/353/head -> origin/gh/soulitzer/353/head 2025-12-04T10:32:19.4079180Z * [new branch] gh/soulitzer/353/orig -> origin/gh/soulitzer/353/orig 2025-12-04T10:32:19.4079253Z * [new branch] gh/soulitzer/358/base -> origin/gh/soulitzer/358/base 2025-12-04T10:32:19.4079352Z * [new branch] gh/soulitzer/358/head -> origin/gh/soulitzer/358/head 
2025-12-04T10:32:19.4079424Z * [new branch] gh/soulitzer/358/orig -> origin/gh/soulitzer/358/orig 2025-12-04T10:32:19.4079497Z * [new branch] gh/soulitzer/359/base -> origin/gh/soulitzer/359/base 2025-12-04T10:32:19.4079609Z * [new branch] gh/soulitzer/359/head -> origin/gh/soulitzer/359/head 2025-12-04T10:32:19.4079681Z * [new branch] gh/soulitzer/359/orig -> origin/gh/soulitzer/359/orig 2025-12-04T10:32:19.4079753Z * [new branch] gh/soulitzer/374/base -> origin/gh/soulitzer/374/base 2025-12-04T10:32:19.4079826Z * [new branch] gh/soulitzer/374/head -> origin/gh/soulitzer/374/head 2025-12-04T10:32:19.4079897Z * [new branch] gh/soulitzer/374/orig -> origin/gh/soulitzer/374/orig 2025-12-04T10:32:19.4079969Z * [new branch] gh/soulitzer/375/base -> origin/gh/soulitzer/375/base 2025-12-04T10:32:19.4080041Z * [new branch] gh/soulitzer/375/head -> origin/gh/soulitzer/375/head 2025-12-04T10:32:19.4080113Z * [new branch] gh/soulitzer/375/orig -> origin/gh/soulitzer/375/orig 2025-12-04T10:32:19.4080186Z * [new branch] gh/soulitzer/380/base -> origin/gh/soulitzer/380/base 2025-12-04T10:32:19.4080256Z * [new branch] gh/soulitzer/380/head -> origin/gh/soulitzer/380/head 2025-12-04T10:32:19.4080330Z * [new branch] gh/soulitzer/380/orig -> origin/gh/soulitzer/380/orig 2025-12-04T10:32:19.4080402Z * [new branch] gh/soulitzer/385/base -> origin/gh/soulitzer/385/base 2025-12-04T10:32:19.4080477Z * [new branch] gh/soulitzer/385/head -> origin/gh/soulitzer/385/head 2025-12-04T10:32:19.4080549Z * [new branch] gh/soulitzer/385/orig -> origin/gh/soulitzer/385/orig 2025-12-04T10:32:19.4080621Z * [new branch] gh/soulitzer/386/base -> origin/gh/soulitzer/386/base 2025-12-04T10:32:19.4080696Z * [new branch] gh/soulitzer/386/head -> origin/gh/soulitzer/386/head 2025-12-04T10:32:19.4080770Z * [new branch] gh/soulitzer/386/orig -> origin/gh/soulitzer/386/orig 2025-12-04T10:32:19.4080842Z * [new branch] gh/soulitzer/387/base -> origin/gh/soulitzer/387/base 2025-12-04T10:32:19.4080913Z * [new branch] gh/soulitzer/387/head -> origin/gh/soulitzer/387/head 2025-12-04T10:32:19.4080985Z * [new branch] gh/soulitzer/387/orig -> origin/gh/soulitzer/387/orig 2025-12-04T10:32:19.4081109Z * [new branch] gh/soulitzer/388/base -> origin/gh/soulitzer/388/base 2025-12-04T10:32:19.4081180Z * [new branch] gh/soulitzer/388/head -> origin/gh/soulitzer/388/head 2025-12-04T10:32:19.4081255Z * [new branch] gh/soulitzer/388/orig -> origin/gh/soulitzer/388/orig 2025-12-04T10:32:19.4081328Z * [new branch] gh/soulitzer/389/base -> origin/gh/soulitzer/389/base 2025-12-04T10:32:19.4081401Z * [new branch] gh/soulitzer/389/head -> origin/gh/soulitzer/389/head 2025-12-04T10:32:19.4081476Z * [new branch] gh/soulitzer/389/orig -> origin/gh/soulitzer/389/orig 2025-12-04T10:32:19.4081547Z * [new branch] gh/soulitzer/390/base -> origin/gh/soulitzer/390/base 2025-12-04T10:32:19.4081618Z * [new branch] gh/soulitzer/390/head -> origin/gh/soulitzer/390/head 2025-12-04T10:32:19.4081691Z * [new branch] gh/soulitzer/390/orig -> origin/gh/soulitzer/390/orig 2025-12-04T10:32:19.4081764Z * [new branch] gh/soulitzer/391/base -> origin/gh/soulitzer/391/base 2025-12-04T10:32:19.4081837Z * [new branch] gh/soulitzer/391/head -> origin/gh/soulitzer/391/head 2025-12-04T10:32:19.4081908Z * [new branch] gh/soulitzer/391/orig -> origin/gh/soulitzer/391/orig 2025-12-04T10:32:19.4081978Z * [new branch] gh/soulitzer/392/base -> origin/gh/soulitzer/392/base 2025-12-04T10:32:19.4082121Z * [new branch] gh/soulitzer/392/head -> origin/gh/soulitzer/392/head 
2025-12-04T10:32:19.4082193Z * [new branch] gh/soulitzer/392/orig -> origin/gh/soulitzer/392/orig 2025-12-04T10:32:19.4082264Z * [new branch] gh/swolchok/728/next -> origin/gh/swolchok/728/next 2025-12-04T10:32:19.4082337Z * [new branch] gh/swolchok/819/base -> origin/gh/swolchok/819/base 2025-12-04T10:32:19.4082407Z * [new branch] gh/swolchok/819/head -> origin/gh/swolchok/819/head 2025-12-04T10:32:19.4082479Z * [new branch] gh/swolchok/819/orig -> origin/gh/swolchok/819/orig 2025-12-04T10:32:19.4082549Z * [new branch] gh/swolchok/824/base -> origin/gh/swolchok/824/base 2025-12-04T10:32:19.4082620Z * [new branch] gh/swolchok/824/head -> origin/gh/swolchok/824/head 2025-12-04T10:32:19.4082690Z * [new branch] gh/swolchok/824/orig -> origin/gh/swolchok/824/orig 2025-12-04T10:32:19.4082762Z * [new branch] gh/swolchok/829/base -> origin/gh/swolchok/829/base 2025-12-04T10:32:19.4082833Z * [new branch] gh/swolchok/829/head -> origin/gh/swolchok/829/head 2025-12-04T10:32:19.4082904Z * [new branch] gh/swolchok/829/orig -> origin/gh/swolchok/829/orig 2025-12-04T10:32:19.4082974Z * [new branch] gh/swolchok/839/base -> origin/gh/swolchok/839/base 2025-12-04T10:32:19.4083043Z * [new branch] gh/swolchok/839/head -> origin/gh/swolchok/839/head 2025-12-04T10:32:19.4083116Z * [new branch] gh/swolchok/839/orig -> origin/gh/swolchok/839/orig 2025-12-04T10:32:19.4083186Z * [new branch] gh/swolchok/841/base -> origin/gh/swolchok/841/base 2025-12-04T10:32:19.4083255Z * [new branch] gh/swolchok/841/head -> origin/gh/swolchok/841/head 2025-12-04T10:32:19.4083326Z * [new branch] gh/swolchok/841/orig -> origin/gh/swolchok/841/orig 2025-12-04T10:32:19.4083399Z * [new branch] gh/swolchok/842/base -> origin/gh/swolchok/842/base 2025-12-04T10:32:19.4083468Z * [new branch] gh/swolchok/842/head -> origin/gh/swolchok/842/head 2025-12-04T10:32:19.4083538Z * [new branch] gh/swolchok/842/orig -> origin/gh/swolchok/842/orig 2025-12-04T10:32:19.4083607Z * [new branch] gh/swolchok/845/base -> origin/gh/swolchok/845/base 2025-12-04T10:32:19.4083678Z * [new branch] gh/swolchok/845/head -> origin/gh/swolchok/845/head 2025-12-04T10:32:19.4083776Z * [new branch] gh/swolchok/845/orig -> origin/gh/swolchok/845/orig 2025-12-04T10:32:19.4083846Z * [new branch] gh/swolchok/848/base -> origin/gh/swolchok/848/base 2025-12-04T10:32:19.4083915Z * [new branch] gh/swolchok/848/head -> origin/gh/swolchok/848/head 2025-12-04T10:32:19.4083988Z * [new branch] gh/swolchok/848/orig -> origin/gh/swolchok/848/orig 2025-12-04T10:32:19.4084058Z * [new branch] gh/swolchok/856/base -> origin/gh/swolchok/856/base 2025-12-04T10:32:19.4084128Z * [new branch] gh/swolchok/856/head -> origin/gh/swolchok/856/head 2025-12-04T10:32:19.4084200Z * [new branch] gh/swolchok/856/orig -> origin/gh/swolchok/856/orig 2025-12-04T10:32:19.4084270Z * [new branch] gh/swolchok/860/base -> origin/gh/swolchok/860/base 2025-12-04T10:32:19.4084341Z * [new branch] gh/swolchok/860/head -> origin/gh/swolchok/860/head 2025-12-04T10:32:19.4084414Z * [new branch] gh/swolchok/860/orig -> origin/gh/swolchok/860/orig 2025-12-04T10:32:19.4084484Z * [new branch] gh/swolchok/861/base -> origin/gh/swolchok/861/base 2025-12-04T10:32:19.4084553Z * [new branch] gh/swolchok/861/head -> origin/gh/swolchok/861/head 2025-12-04T10:32:19.4084658Z * [new branch] gh/swolchok/861/orig -> origin/gh/swolchok/861/orig 2025-12-04T10:32:19.4084728Z * [new branch] gh/swolchok/862/base -> origin/gh/swolchok/862/base 2025-12-04T10:32:19.4084797Z * [new branch] gh/swolchok/862/head -> origin/gh/swolchok/862/head 
2025-12-04T10:32:19.4084867Z * [new branch] gh/swolchok/862/orig -> origin/gh/swolchok/862/orig 2025-12-04T10:32:19.4084938Z * [new branch] gh/swolchok/863/base -> origin/gh/swolchok/863/base 2025-12-04T10:32:19.4085009Z * [new branch] gh/swolchok/863/head -> origin/gh/swolchok/863/head 2025-12-04T10:32:19.4085081Z * [new branch] gh/swolchok/863/orig -> origin/gh/swolchok/863/orig 2025-12-04T10:32:19.4085150Z * [new branch] gh/swolchok/864/base -> origin/gh/swolchok/864/base 2025-12-04T10:32:19.4085221Z * [new branch] gh/swolchok/864/head -> origin/gh/swolchok/864/head 2025-12-04T10:32:19.4085291Z * [new branch] gh/swolchok/864/orig -> origin/gh/swolchok/864/orig 2025-12-04T10:32:19.4085361Z * [new branch] gh/swolchok/865/base -> origin/gh/swolchok/865/base 2025-12-04T10:32:19.4085432Z * [new branch] gh/swolchok/865/head -> origin/gh/swolchok/865/head 2025-12-04T10:32:19.4085501Z * [new branch] gh/swolchok/865/orig -> origin/gh/swolchok/865/orig 2025-12-04T10:32:19.4085570Z * [new branch] gh/swolchok/866/base -> origin/gh/swolchok/866/base 2025-12-04T10:32:19.4085640Z * [new branch] gh/swolchok/866/head -> origin/gh/swolchok/866/head 2025-12-04T10:32:19.4085710Z * [new branch] gh/swolchok/866/orig -> origin/gh/swolchok/866/orig 2025-12-04T10:32:19.4085779Z * [new branch] gh/swolchok/867/base -> origin/gh/swolchok/867/base 2025-12-04T10:32:19.4085848Z * [new branch] gh/swolchok/867/head -> origin/gh/swolchok/867/head 2025-12-04T10:32:19.4085919Z * [new branch] gh/swolchok/867/orig -> origin/gh/swolchok/867/orig 2025-12-04T10:32:19.4085988Z * [new branch] gh/swolchok/868/base -> origin/gh/swolchok/868/base 2025-12-04T10:32:19.4086059Z * [new branch] gh/swolchok/868/head -> origin/gh/swolchok/868/head 2025-12-04T10:32:19.4086128Z * [new branch] gh/swolchok/868/orig -> origin/gh/swolchok/868/orig 2025-12-04T10:32:19.4086197Z * [new branch] gh/swolchok/869/base -> origin/gh/swolchok/869/base 2025-12-04T10:32:19.4086294Z * [new branch] gh/swolchok/869/head -> origin/gh/swolchok/869/head 2025-12-04T10:32:19.4086363Z * [new branch] gh/swolchok/869/orig -> origin/gh/swolchok/869/orig 2025-12-04T10:32:19.4086436Z * [new branch] gh/swolchok/870/base -> origin/gh/swolchok/870/base 2025-12-04T10:32:19.4086504Z * [new branch] gh/swolchok/870/head -> origin/gh/swolchok/870/head 2025-12-04T10:32:19.4086576Z * [new branch] gh/swolchok/870/orig -> origin/gh/swolchok/870/orig 2025-12-04T10:32:19.4086647Z * [new branch] gh/swolchok/871/base -> origin/gh/swolchok/871/base 2025-12-04T10:32:19.4086716Z * [new branch] gh/swolchok/871/head -> origin/gh/swolchok/871/head 2025-12-04T10:32:19.4086785Z * [new branch] gh/swolchok/871/orig -> origin/gh/swolchok/871/orig 2025-12-04T10:32:19.4086856Z * [new branch] gh/teja-rao/4/base -> origin/gh/teja-rao/4/base 2025-12-04T10:32:19.4086926Z * [new branch] gh/teja-rao/4/head -> origin/gh/teja-rao/4/head 2025-12-04T10:32:19.4086994Z * [new branch] gh/teja-rao/4/orig -> origin/gh/teja-rao/4/orig 2025-12-04T10:32:19.4087064Z * [new branch] gh/tianyu-l/2/base -> origin/gh/tianyu-l/2/base 2025-12-04T10:32:19.4087131Z * [new branch] gh/tianyu-l/2/head -> origin/gh/tianyu-l/2/head 2025-12-04T10:32:19.4087224Z * [new branch] gh/tianyu-l/2/orig -> origin/gh/tianyu-l/2/orig 2025-12-04T10:32:19.4087293Z * [new branch] gh/tianyu-l/3/base -> origin/gh/tianyu-l/3/base 2025-12-04T10:32:19.4087360Z * [new branch] gh/tianyu-l/3/orig -> origin/gh/tianyu-l/3/orig 2025-12-04T10:32:19.4087428Z * [new branch] gh/tianyu-l/4/base -> origin/gh/tianyu-l/4/base 2025-12-04T10:32:19.4087496Z * [new 
branch] gh/tianyu-l/4/head -> origin/gh/tianyu-l/4/head 2025-12-04T10:32:19.4087564Z * [new branch] gh/tianyu-l/4/orig -> origin/gh/tianyu-l/4/orig 2025-12-04T10:32:19.4087654Z * [new branch] gh/tugsbayasgalan/10/base -> origin/gh/tugsbayasgalan/10/base 2025-12-04T10:32:19.4087739Z * [new branch] gh/tugsbayasgalan/10/head -> origin/gh/tugsbayasgalan/10/head 2025-12-04T10:32:19.4087822Z * [new branch] gh/tugsbayasgalan/10/orig -> origin/gh/tugsbayasgalan/10/orig 2025-12-04T10:32:19.4087906Z * [new branch] gh/tugsbayasgalan/13/base -> origin/gh/tugsbayasgalan/13/base 2025-12-04T10:32:19.4087991Z * [new branch] gh/tugsbayasgalan/13/head -> origin/gh/tugsbayasgalan/13/head 2025-12-04T10:32:19.4088071Z * [new branch] gh/tugsbayasgalan/13/orig -> origin/gh/tugsbayasgalan/13/orig 2025-12-04T10:32:19.4088154Z * [new branch] gh/tugsbayasgalan/17/base -> origin/gh/tugsbayasgalan/17/base 2025-12-04T10:32:19.4088236Z * [new branch] gh/tugsbayasgalan/17/head -> origin/gh/tugsbayasgalan/17/head 2025-12-04T10:32:19.4088318Z * [new branch] gh/tugsbayasgalan/17/orig -> origin/gh/tugsbayasgalan/17/orig 2025-12-04T10:32:19.4088401Z * [new branch] gh/tugsbayasgalan/2/base -> origin/gh/tugsbayasgalan/2/base 2025-12-04T10:32:19.4088483Z * [new branch] gh/tugsbayasgalan/2/head -> origin/gh/tugsbayasgalan/2/head 2025-12-04T10:32:19.4088562Z * [new branch] gh/tugsbayasgalan/2/orig -> origin/gh/tugsbayasgalan/2/orig 2025-12-04T10:32:19.4088645Z * [new branch] gh/tugsbayasgalan/28/base -> origin/gh/tugsbayasgalan/28/base 2025-12-04T10:32:19.4088726Z * [new branch] gh/tugsbayasgalan/28/head -> origin/gh/tugsbayasgalan/28/head 2025-12-04T10:32:19.4088808Z * [new branch] gh/tugsbayasgalan/28/orig -> origin/gh/tugsbayasgalan/28/orig 2025-12-04T10:32:19.4088893Z * [new branch] gh/tugsbayasgalan/32/base -> origin/gh/tugsbayasgalan/32/base 2025-12-04T10:32:19.4089003Z * [new branch] gh/tugsbayasgalan/32/head -> origin/gh/tugsbayasgalan/32/head 2025-12-04T10:32:19.4089084Z * [new branch] gh/tugsbayasgalan/32/orig -> origin/gh/tugsbayasgalan/32/orig 2025-12-04T10:32:19.4089167Z * [new branch] gh/tugsbayasgalan/35/base -> origin/gh/tugsbayasgalan/35/base 2025-12-04T10:32:19.4089248Z * [new branch] gh/tugsbayasgalan/35/head -> origin/gh/tugsbayasgalan/35/head 2025-12-04T10:32:19.4089328Z * [new branch] gh/tugsbayasgalan/35/orig -> origin/gh/tugsbayasgalan/35/orig 2025-12-04T10:32:19.4089413Z * [new branch] gh/tugsbayasgalan/36/base -> origin/gh/tugsbayasgalan/36/base 2025-12-04T10:32:19.4089494Z * [new branch] gh/tugsbayasgalan/36/head -> origin/gh/tugsbayasgalan/36/head 2025-12-04T10:32:19.4089609Z * [new branch] gh/tugsbayasgalan/36/orig -> origin/gh/tugsbayasgalan/36/orig 2025-12-04T10:32:19.4089691Z * [new branch] gh/tugsbayasgalan/37/base -> origin/gh/tugsbayasgalan/37/base 2025-12-04T10:32:19.4089774Z * [new branch] gh/tugsbayasgalan/37/head -> origin/gh/tugsbayasgalan/37/head 2025-12-04T10:32:19.4089857Z * [new branch] gh/tugsbayasgalan/37/orig -> origin/gh/tugsbayasgalan/37/orig 2025-12-04T10:32:19.4089936Z * [new branch] gh/tugsbayasgalan/43/base -> origin/gh/tugsbayasgalan/43/base 2025-12-04T10:32:19.4090068Z * [new branch] gh/tugsbayasgalan/43/head -> origin/gh/tugsbayasgalan/43/head 2025-12-04T10:32:19.4090151Z * [new branch] gh/tugsbayasgalan/43/orig -> origin/gh/tugsbayasgalan/43/orig 2025-12-04T10:32:19.4090231Z * [new branch] gh/tugsbayasgalan/48/base -> origin/gh/tugsbayasgalan/48/base 2025-12-04T10:32:19.4090312Z * [new branch] gh/tugsbayasgalan/48/head -> origin/gh/tugsbayasgalan/48/head 
2025-12-04T10:32:19.4090393Z * [new branch] gh/tugsbayasgalan/48/orig -> origin/gh/tugsbayasgalan/48/orig 2025-12-04T10:32:19.4090476Z * [new branch] gh/tugsbayasgalan/51/base -> origin/gh/tugsbayasgalan/51/base 2025-12-04T10:32:19.4090556Z * [new branch] gh/tugsbayasgalan/51/head -> origin/gh/tugsbayasgalan/51/head 2025-12-04T10:32:19.4090641Z * [new branch] gh/tugsbayasgalan/51/orig -> origin/gh/tugsbayasgalan/51/orig 2025-12-04T10:32:19.4090724Z * [new branch] gh/tugsbayasgalan/52/base -> origin/gh/tugsbayasgalan/52/base 2025-12-04T10:32:19.4090805Z * [new branch] gh/tugsbayasgalan/52/head -> origin/gh/tugsbayasgalan/52/head 2025-12-04T10:32:19.4090886Z * [new branch] gh/tugsbayasgalan/52/orig -> origin/gh/tugsbayasgalan/52/orig 2025-12-04T10:32:19.4090967Z * [new branch] gh/tugsbayasgalan/53/base -> origin/gh/tugsbayasgalan/53/base 2025-12-04T10:32:19.4091047Z * [new branch] gh/tugsbayasgalan/53/head -> origin/gh/tugsbayasgalan/53/head 2025-12-04T10:32:19.4091130Z * [new branch] gh/tugsbayasgalan/53/orig -> origin/gh/tugsbayasgalan/53/orig 2025-12-04T10:32:19.4091215Z * [new branch] gh/tugsbayasgalan/55/base -> origin/gh/tugsbayasgalan/55/base 2025-12-04T10:32:19.4091299Z * [new branch] gh/tugsbayasgalan/55/head -> origin/gh/tugsbayasgalan/55/head 2025-12-04T10:32:19.4091381Z * [new branch] gh/tugsbayasgalan/55/orig -> origin/gh/tugsbayasgalan/55/orig 2025-12-04T10:32:19.4091463Z * [new branch] gh/tugsbayasgalan/59/base -> origin/gh/tugsbayasgalan/59/base 2025-12-04T10:32:19.4091546Z * [new branch] gh/tugsbayasgalan/59/head -> origin/gh/tugsbayasgalan/59/head 2025-12-04T10:32:19.4091629Z * [new branch] gh/tugsbayasgalan/59/orig -> origin/gh/tugsbayasgalan/59/orig 2025-12-04T10:32:19.4091710Z * [new branch] gh/tugsbayasgalan/6/base -> origin/gh/tugsbayasgalan/6/base 2025-12-04T10:32:19.4091791Z * [new branch] gh/tugsbayasgalan/6/head -> origin/gh/tugsbayasgalan/6/head 2025-12-04T10:32:19.4091922Z * [new branch] gh/tugsbayasgalan/6/orig -> origin/gh/tugsbayasgalan/6/orig 2025-12-04T10:32:19.4092004Z * [new branch] gh/tugsbayasgalan/60/base -> origin/gh/tugsbayasgalan/60/base 2025-12-04T10:32:19.4092085Z * [new branch] gh/tugsbayasgalan/60/head -> origin/gh/tugsbayasgalan/60/head 2025-12-04T10:32:19.4092167Z * [new branch] gh/tugsbayasgalan/60/orig -> origin/gh/tugsbayasgalan/60/orig 2025-12-04T10:32:19.4092248Z * [new branch] gh/tugsbayasgalan/61/base -> origin/gh/tugsbayasgalan/61/base 2025-12-04T10:32:19.4092332Z * [new branch] gh/tugsbayasgalan/61/head -> origin/gh/tugsbayasgalan/61/head 2025-12-04T10:32:19.4092413Z * [new branch] gh/tugsbayasgalan/61/orig -> origin/gh/tugsbayasgalan/61/orig 2025-12-04T10:32:19.4092494Z * [new branch] gh/tugsbayasgalan/63/base -> origin/gh/tugsbayasgalan/63/base 2025-12-04T10:32:19.4092579Z * [new branch] gh/tugsbayasgalan/63/head -> origin/gh/tugsbayasgalan/63/head 2025-12-04T10:32:19.4092659Z * [new branch] gh/tugsbayasgalan/63/orig -> origin/gh/tugsbayasgalan/63/orig 2025-12-04T10:32:19.4092740Z * [new branch] gh/tugsbayasgalan/67/base -> origin/gh/tugsbayasgalan/67/base 2025-12-04T10:32:19.4092824Z * [new branch] gh/tugsbayasgalan/67/head -> origin/gh/tugsbayasgalan/67/head 2025-12-04T10:32:19.4092929Z * [new branch] gh/tugsbayasgalan/67/orig -> origin/gh/tugsbayasgalan/67/orig 2025-12-04T10:32:19.4093010Z * [new branch] gh/tugsbayasgalan/68/base -> origin/gh/tugsbayasgalan/68/base 2025-12-04T10:32:19.4093090Z * [new branch] gh/tugsbayasgalan/68/head -> origin/gh/tugsbayasgalan/68/head 2025-12-04T10:32:19.4093171Z * [new branch] 
gh/tugsbayasgalan/68/orig -> origin/gh/tugsbayasgalan/68/orig 2025-12-04T10:32:19.4093251Z * [new branch] gh/tugsbayasgalan/7/base -> origin/gh/tugsbayasgalan/7/base 2025-12-04T10:32:19.4093332Z * [new branch] gh/tugsbayasgalan/7/head -> origin/gh/tugsbayasgalan/7/head 2025-12-04T10:32:19.4093410Z * [new branch] gh/tugsbayasgalan/7/orig -> origin/gh/tugsbayasgalan/7/orig 2025-12-04T10:32:19.4093493Z * [new branch] gh/tugsbayasgalan/70/base -> origin/gh/tugsbayasgalan/70/base 2025-12-04T10:32:19.4093575Z * [new branch] gh/tugsbayasgalan/70/head -> origin/gh/tugsbayasgalan/70/head 2025-12-04T10:32:19.4093658Z * [new branch] gh/tugsbayasgalan/70/orig -> origin/gh/tugsbayasgalan/70/orig 2025-12-04T10:32:19.4093739Z * [new branch] gh/tugsbayasgalan/71/base -> origin/gh/tugsbayasgalan/71/base 2025-12-04T10:32:19.4093819Z * [new branch] gh/tugsbayasgalan/71/head -> origin/gh/tugsbayasgalan/71/head 2025-12-04T10:32:19.4093900Z * [new branch] gh/tugsbayasgalan/71/orig -> origin/gh/tugsbayasgalan/71/orig 2025-12-04T10:32:19.4093987Z * [new branch] gh/tugsbayasgalan/72/base -> origin/gh/tugsbayasgalan/72/base 2025-12-04T10:32:19.4094069Z * [new branch] gh/tugsbayasgalan/72/head -> origin/gh/tugsbayasgalan/72/head 2025-12-04T10:32:19.4094150Z * [new branch] gh/tugsbayasgalan/72/orig -> origin/gh/tugsbayasgalan/72/orig 2025-12-04T10:32:19.4094233Z * [new branch] gh/tugsbayasgalan/73/base -> origin/gh/tugsbayasgalan/73/base 2025-12-04T10:32:19.4094315Z * [new branch] gh/tugsbayasgalan/73/head -> origin/gh/tugsbayasgalan/73/head 2025-12-04T10:32:19.4094398Z * [new branch] gh/tugsbayasgalan/73/orig -> origin/gh/tugsbayasgalan/73/orig 2025-12-04T10:32:19.4094478Z * [new branch] gh/tugsbayasgalan/74/base -> origin/gh/tugsbayasgalan/74/base 2025-12-04T10:32:19.4094559Z * [new branch] gh/tugsbayasgalan/74/head -> origin/gh/tugsbayasgalan/74/head 2025-12-04T10:32:19.4094642Z * [new branch] gh/tugsbayasgalan/74/orig -> origin/gh/tugsbayasgalan/74/orig 2025-12-04T10:32:19.4094750Z * [new branch] gh/tugsbayasgalan/75/base -> origin/gh/tugsbayasgalan/75/base 2025-12-04T10:32:19.4094831Z * [new branch] gh/tugsbayasgalan/75/head -> origin/gh/tugsbayasgalan/75/head 2025-12-04T10:32:19.4094914Z * [new branch] gh/tugsbayasgalan/75/orig -> origin/gh/tugsbayasgalan/75/orig 2025-12-04T10:32:19.4094997Z * [new branch] gh/tugsbayasgalan/76/base -> origin/gh/tugsbayasgalan/76/base 2025-12-04T10:32:19.4095078Z * [new branch] gh/tugsbayasgalan/76/head -> origin/gh/tugsbayasgalan/76/head 2025-12-04T10:32:19.4095159Z * [new branch] gh/tugsbayasgalan/76/orig -> origin/gh/tugsbayasgalan/76/orig 2025-12-04T10:32:19.4095240Z * [new branch] gh/tugsbayasgalan/77/base -> origin/gh/tugsbayasgalan/77/base 2025-12-04T10:32:19.4095320Z * [new branch] gh/tugsbayasgalan/77/head -> origin/gh/tugsbayasgalan/77/head 2025-12-04T10:32:19.4095405Z * [new branch] gh/tugsbayasgalan/77/orig -> origin/gh/tugsbayasgalan/77/orig 2025-12-04T10:32:19.4095487Z * [new branch] gh/tugsbayasgalan/78/base -> origin/gh/tugsbayasgalan/78/base 2025-12-04T10:32:19.4095567Z * [new branch] gh/tugsbayasgalan/78/head -> origin/gh/tugsbayasgalan/78/head 2025-12-04T10:32:19.4095678Z * [new branch] gh/tugsbayasgalan/78/orig -> origin/gh/tugsbayasgalan/78/orig 2025-12-04T10:32:19.4095759Z * [new branch] gh/tugsbayasgalan/79/base -> origin/gh/tugsbayasgalan/79/base 2025-12-04T10:32:19.4095840Z * [new branch] gh/tugsbayasgalan/79/head -> origin/gh/tugsbayasgalan/79/head 2025-12-04T10:32:19.4095924Z * [new branch] gh/tugsbayasgalan/79/orig -> origin/gh/tugsbayasgalan/79/orig 
2025-12-04T10:32:19.4096004Z * [new branch] gh/tugsbayasgalan/8/base -> origin/gh/tugsbayasgalan/8/base 2025-12-04T10:32:19.4096086Z * [new branch] gh/tugsbayasgalan/8/head -> origin/gh/tugsbayasgalan/8/head 2025-12-04T10:32:19.4096165Z * [new branch] gh/tugsbayasgalan/8/orig -> origin/gh/tugsbayasgalan/8/orig 2025-12-04T10:32:19.4096247Z * [new branch] gh/tugsbayasgalan/80/base -> origin/gh/tugsbayasgalan/80/base 2025-12-04T10:32:19.4096328Z * [new branch] gh/tugsbayasgalan/80/head -> origin/gh/tugsbayasgalan/80/head 2025-12-04T10:32:19.4096412Z * [new branch] gh/tugsbayasgalan/80/orig -> origin/gh/tugsbayasgalan/80/orig 2025-12-04T10:32:19.4096493Z * [new branch] gh/tugsbayasgalan/81/base -> origin/gh/tugsbayasgalan/81/base 2025-12-04T10:32:19.4096574Z * [new branch] gh/tugsbayasgalan/81/head -> origin/gh/tugsbayasgalan/81/head 2025-12-04T10:32:19.4096655Z * [new branch] gh/tugsbayasgalan/81/orig -> origin/gh/tugsbayasgalan/81/orig 2025-12-04T10:32:19.4096736Z * [new branch] gh/tugsbayasgalan/82/base -> origin/gh/tugsbayasgalan/82/base 2025-12-04T10:32:19.4096818Z * [new branch] gh/tugsbayasgalan/82/head -> origin/gh/tugsbayasgalan/82/head 2025-12-04T10:32:19.4096900Z * [new branch] gh/tugsbayasgalan/82/orig -> origin/gh/tugsbayasgalan/82/orig 2025-12-04T10:32:19.4096980Z * [new branch] gh/tugsbayasgalan/83/base -> origin/gh/tugsbayasgalan/83/base 2025-12-04T10:32:19.4097062Z * [new branch] gh/tugsbayasgalan/83/head -> origin/gh/tugsbayasgalan/83/head 2025-12-04T10:32:19.4097143Z * [new branch] gh/tugsbayasgalan/83/orig -> origin/gh/tugsbayasgalan/83/orig 2025-12-04T10:32:19.4097222Z * [new branch] gh/tugsbayasgalan/84/base -> origin/gh/tugsbayasgalan/84/base 2025-12-04T10:32:19.4097305Z * [new branch] gh/tugsbayasgalan/84/head -> origin/gh/tugsbayasgalan/84/head 2025-12-04T10:32:19.4097387Z * [new branch] gh/tugsbayasgalan/84/orig -> origin/gh/tugsbayasgalan/84/orig 2025-12-04T10:32:19.4097496Z * [new branch] gh/tugsbayasgalan/85/base -> origin/gh/tugsbayasgalan/85/base 2025-12-04T10:32:19.4097578Z * [new branch] gh/tugsbayasgalan/85/head -> origin/gh/tugsbayasgalan/85/head 2025-12-04T10:32:19.4097659Z * [new branch] gh/tugsbayasgalan/85/orig -> origin/gh/tugsbayasgalan/85/orig 2025-12-04T10:32:19.4097740Z * [new branch] gh/tugsbayasgalan/86/base -> origin/gh/tugsbayasgalan/86/base 2025-12-04T10:32:19.4097820Z * [new branch] gh/tugsbayasgalan/86/head -> origin/gh/tugsbayasgalan/86/head 2025-12-04T10:32:19.4097900Z * [new branch] gh/tugsbayasgalan/86/orig -> origin/gh/tugsbayasgalan/86/orig 2025-12-04T10:32:19.4097982Z * [new branch] gh/tugsbayasgalan/87/base -> origin/gh/tugsbayasgalan/87/base 2025-12-04T10:32:19.4098062Z * [new branch] gh/tugsbayasgalan/87/head -> origin/gh/tugsbayasgalan/87/head 2025-12-04T10:32:19.4098142Z * [new branch] gh/tugsbayasgalan/87/orig -> origin/gh/tugsbayasgalan/87/orig 2025-12-04T10:32:19.4098225Z * [new branch] gh/tugsbayasgalan/88/base -> origin/gh/tugsbayasgalan/88/base 2025-12-04T10:32:19.4098305Z * [new branch] gh/tugsbayasgalan/88/head -> origin/gh/tugsbayasgalan/88/head 2025-12-04T10:32:19.4098385Z * [new branch] gh/tugsbayasgalan/88/orig -> origin/gh/tugsbayasgalan/88/orig 2025-12-04T10:32:19.4098497Z * [new branch] gh/tugsbayasgalan/89/base -> origin/gh/tugsbayasgalan/89/base 2025-12-04T10:32:19.4098579Z * [new branch] gh/tugsbayasgalan/89/head -> origin/gh/tugsbayasgalan/89/head 2025-12-04T10:32:19.4098660Z * [new branch] gh/tugsbayasgalan/89/orig -> origin/gh/tugsbayasgalan/89/orig 2025-12-04T10:32:19.4098740Z * [new branch] 
gh/tugsbayasgalan/9/base -> origin/gh/tugsbayasgalan/9/base 2025-12-04T10:32:19.4098819Z * [new branch] gh/tugsbayasgalan/9/head -> origin/gh/tugsbayasgalan/9/head 2025-12-04T10:32:19.4098899Z * [new branch] gh/tugsbayasgalan/9/orig -> origin/gh/tugsbayasgalan/9/orig 2025-12-04T10:32:19.4098982Z * [new branch] gh/tugsbayasgalan/90/base -> origin/gh/tugsbayasgalan/90/base 2025-12-04T10:32:19.4099063Z * [new branch] gh/tugsbayasgalan/90/head -> origin/gh/tugsbayasgalan/90/head 2025-12-04T10:32:19.4099146Z * [new branch] gh/tugsbayasgalan/90/orig -> origin/gh/tugsbayasgalan/90/orig 2025-12-04T10:32:19.4099227Z * [new branch] gh/tugsbayasgalan/91/base -> origin/gh/tugsbayasgalan/91/base 2025-12-04T10:32:19.4099307Z * [new branch] gh/tugsbayasgalan/91/head -> origin/gh/tugsbayasgalan/91/head 2025-12-04T10:32:19.4099389Z * [new branch] gh/tugsbayasgalan/91/orig -> origin/gh/tugsbayasgalan/91/orig 2025-12-04T10:32:19.4099471Z * [new branch] gh/tugsbayasgalan/92/base -> origin/gh/tugsbayasgalan/92/base 2025-12-04T10:32:19.4099552Z * [new branch] gh/tugsbayasgalan/92/head -> origin/gh/tugsbayasgalan/92/head 2025-12-04T10:32:19.4099667Z * [new branch] gh/tugsbayasgalan/92/orig -> origin/gh/tugsbayasgalan/92/orig 2025-12-04T10:32:19.4099752Z * [new branch] gh/tugsbayasgalan/93/base -> origin/gh/tugsbayasgalan/93/base 2025-12-04T10:32:19.4099834Z * [new branch] gh/tugsbayasgalan/93/head -> origin/gh/tugsbayasgalan/93/head 2025-12-04T10:32:19.4099916Z * [new branch] gh/tugsbayasgalan/93/orig -> origin/gh/tugsbayasgalan/93/orig 2025-12-04T10:32:19.4099984Z * [new branch] gh/v0i0/14/base -> origin/gh/v0i0/14/base 2025-12-04T10:32:19.4100048Z * [new branch] gh/v0i0/14/head -> origin/gh/v0i0/14/head 2025-12-04T10:32:19.4100116Z * [new branch] gh/v0i0/14/orig -> origin/gh/v0i0/14/orig 2025-12-04T10:32:19.4100180Z * [new branch] gh/v0i0/15/base -> origin/gh/v0i0/15/base 2025-12-04T10:32:19.4100290Z * [new branch] gh/v0i0/15/head -> origin/gh/v0i0/15/head 2025-12-04T10:32:19.4100351Z * [new branch] gh/v0i0/15/orig -> origin/gh/v0i0/15/orig 2025-12-04T10:32:19.4100412Z * [new branch] gh/v0i0/16/base -> origin/gh/v0i0/16/base 2025-12-04T10:32:19.4100473Z * [new branch] gh/v0i0/16/head -> origin/gh/v0i0/16/head 2025-12-04T10:32:19.4100538Z * [new branch] gh/v0i0/16/orig -> origin/gh/v0i0/16/orig 2025-12-04T10:32:19.4100599Z * [new branch] gh/v0i0/17/base -> origin/gh/v0i0/17/base 2025-12-04T10:32:19.4100663Z * [new branch] gh/v0i0/17/head -> origin/gh/v0i0/17/head 2025-12-04T10:32:19.4100726Z * [new branch] gh/v0i0/17/orig -> origin/gh/v0i0/17/orig 2025-12-04T10:32:19.4100787Z * [new branch] gh/v0i0/18/base -> origin/gh/v0i0/18/base 2025-12-04T10:32:19.4100848Z * [new branch] gh/v0i0/18/head -> origin/gh/v0i0/18/head 2025-12-04T10:32:19.4100912Z * [new branch] gh/v0i0/18/orig -> origin/gh/v0i0/18/orig 2025-12-04T10:32:19.4100973Z * [new branch] gh/v0i0/19/base -> origin/gh/v0i0/19/base 2025-12-04T10:32:19.4101036Z * [new branch] gh/v0i0/19/head -> origin/gh/v0i0/19/head 2025-12-04T10:32:19.4101132Z * [new branch] gh/v0i0/19/orig -> origin/gh/v0i0/19/orig 2025-12-04T10:32:19.4101211Z * [new branch] gh/vishal9-team/1/base -> origin/gh/vishal9-team/1/base 2025-12-04T10:32:19.4101288Z * [new branch] gh/vishal9-team/1/head -> origin/gh/vishal9-team/1/head 2025-12-04T10:32:19.4101362Z * [new branch] gh/vishal9-team/2/base -> origin/gh/vishal9-team/2/base 2025-12-04T10:32:19.4101435Z * [new branch] gh/vishal9-team/2/head -> origin/gh/vishal9-team/2/head 2025-12-04T10:32:19.4101509Z * [new branch] gh/vishal9-team/2/orig 
-> origin/gh/vishal9-team/2/orig 2025-12-04T10:32:19.4101584Z * [new branch] gh/vishal9-team/3/base -> origin/gh/vishal9-team/3/base 2025-12-04T10:32:19.4101659Z * [new branch] gh/vishal9-team/3/head -> origin/gh/vishal9-team/3/head 2025-12-04T10:32:19.4101732Z * [new branch] gh/vishal9-team/3/orig -> origin/gh/vishal9-team/3/orig 2025-12-04T10:32:19.4101805Z * [new branch] gh/vishal9-team/4/base -> origin/gh/vishal9-team/4/base 2025-12-04T10:32:19.4101877Z * [new branch] gh/vishal9-team/4/head -> origin/gh/vishal9-team/4/head 2025-12-04T10:32:19.4101951Z * [new branch] gh/vishal9-team/4/orig -> origin/gh/vishal9-team/4/orig 2025-12-04T10:32:19.4102017Z * [new branch] gh/vkuzo/1/next -> origin/gh/vkuzo/1/next 2025-12-04T10:32:19.4102081Z * [new branch] gh/vkuzo/2/next -> origin/gh/vkuzo/2/next 2025-12-04T10:32:19.4102151Z * [new branch] gh/vkuzo/3/next -> origin/gh/vkuzo/3/next 2025-12-04T10:32:19.4102223Z * [new branch] gh/wconstab/424/base -> origin/gh/wconstab/424/base 2025-12-04T10:32:19.4102298Z * [new branch] gh/wconstab/424/head -> origin/gh/wconstab/424/head 2025-12-04T10:32:19.4102371Z * [new branch] gh/wconstab/424/orig -> origin/gh/wconstab/424/orig 2025-12-04T10:32:19.4102443Z * [new branch] gh/wconstab/435/base -> origin/gh/wconstab/435/base 2025-12-04T10:32:19.4102515Z * [new branch] gh/wconstab/435/head -> origin/gh/wconstab/435/head 2025-12-04T10:32:19.4102584Z * [new branch] gh/wconstab/435/orig -> origin/gh/wconstab/435/orig 2025-12-04T10:32:19.4102654Z * [new branch] gh/wconstab/444/base -> origin/gh/wconstab/444/base 2025-12-04T10:32:19.4102724Z * [new branch] gh/wconstab/444/head -> origin/gh/wconstab/444/head 2025-12-04T10:32:19.4102832Z * [new branch] gh/wconstab/444/orig -> origin/gh/wconstab/444/orig 2025-12-04T10:32:19.4102902Z * [new branch] gh/wconstab/447/base -> origin/gh/wconstab/447/base 2025-12-04T10:32:19.4102972Z * [new branch] gh/wconstab/447/head -> origin/gh/wconstab/447/head 2025-12-04T10:32:19.4103041Z * [new branch] gh/wconstab/447/orig -> origin/gh/wconstab/447/orig 2025-12-04T10:32:19.4103113Z * [new branch] gh/wconstab/448/base -> origin/gh/wconstab/448/base 2025-12-04T10:32:19.4103184Z * [new branch] gh/wconstab/448/head -> origin/gh/wconstab/448/head 2025-12-04T10:32:19.4103255Z * [new branch] gh/wconstab/448/orig -> origin/gh/wconstab/448/orig 2025-12-04T10:32:19.4103324Z * [new branch] gh/wconstab/449/base -> origin/gh/wconstab/449/base 2025-12-04T10:32:19.4103395Z * [new branch] gh/wconstab/449/head -> origin/gh/wconstab/449/head 2025-12-04T10:32:19.4103467Z * [new branch] gh/wconstab/449/orig -> origin/gh/wconstab/449/orig 2025-12-04T10:32:19.4103536Z * [new branch] gh/wconstab/450/base -> origin/gh/wconstab/450/base 2025-12-04T10:32:19.4103608Z * [new branch] gh/wconstab/450/head -> origin/gh/wconstab/450/head 2025-12-04T10:32:19.4103678Z * [new branch] gh/wconstab/450/orig -> origin/gh/wconstab/450/orig 2025-12-04T10:32:19.4103778Z * [new branch] gh/wconstab/451/base -> origin/gh/wconstab/451/base 2025-12-04T10:32:19.4103851Z * [new branch] gh/wconstab/451/head -> origin/gh/wconstab/451/head 2025-12-04T10:32:19.4103921Z * [new branch] gh/wconstab/451/orig -> origin/gh/wconstab/451/orig 2025-12-04T10:32:19.4103993Z * [new branch] gh/wconstab/452/base -> origin/gh/wconstab/452/base 2025-12-04T10:32:19.4104064Z * [new branch] gh/wconstab/452/head -> origin/gh/wconstab/452/head 2025-12-04T10:32:19.4104134Z * [new branch] gh/wconstab/452/orig -> origin/gh/wconstab/452/orig 2025-12-04T10:32:19.4104205Z * [new branch] gh/wconstab/453/base -> 
origin/gh/wconstab/453/base 2025-12-04T10:32:19.4104276Z * [new branch] gh/wconstab/453/head -> origin/gh/wconstab/453/head 2025-12-04T10:32:19.4104347Z * [new branch] gh/wconstab/453/orig -> origin/gh/wconstab/453/orig 2025-12-04T10:32:19.4104421Z * [new branch] gh/wconstab/454/base -> origin/gh/wconstab/454/base 2025-12-04T10:32:19.4104490Z * [new branch] gh/wconstab/454/head -> origin/gh/wconstab/454/head 2025-12-04T10:32:19.4104559Z * [new branch] gh/wconstab/454/orig -> origin/gh/wconstab/454/orig 2025-12-04T10:32:19.4104632Z * [new branch] gh/wconstab/455/base -> origin/gh/wconstab/455/base 2025-12-04T10:32:19.4104701Z * [new branch] gh/wconstab/455/head -> origin/gh/wconstab/455/head 2025-12-04T10:32:19.4104772Z * [new branch] gh/wconstab/455/orig -> origin/gh/wconstab/455/orig 2025-12-04T10:32:19.4104844Z * [new branch] gh/wconstab/456/base -> origin/gh/wconstab/456/base 2025-12-04T10:32:19.4104913Z * [new branch] gh/wconstab/456/head -> origin/gh/wconstab/456/head 2025-12-04T10:32:19.4104984Z * [new branch] gh/wconstab/456/orig -> origin/gh/wconstab/456/orig 2025-12-04T10:32:19.4105056Z * [new branch] gh/wconstab/457/base -> origin/gh/wconstab/457/base 2025-12-04T10:32:19.4105127Z * [new branch] gh/wconstab/457/head -> origin/gh/wconstab/457/head 2025-12-04T10:32:19.4105197Z * [new branch] gh/wconstab/457/orig -> origin/gh/wconstab/457/orig 2025-12-04T10:32:19.4105269Z * [new branch] gh/wconstab/458/base -> origin/gh/wconstab/458/base 2025-12-04T10:32:19.4105340Z * [new branch] gh/wconstab/458/head -> origin/gh/wconstab/458/head 2025-12-04T10:32:19.4105439Z * [new branch] gh/wconstab/458/orig -> origin/gh/wconstab/458/orig 2025-12-04T10:32:19.4105508Z * [new branch] gh/wconstab/459/base -> origin/gh/wconstab/459/base 2025-12-04T10:32:19.4105578Z * [new branch] gh/wconstab/459/head -> origin/gh/wconstab/459/head 2025-12-04T10:32:19.4105650Z * [new branch] gh/wconstab/459/orig -> origin/gh/wconstab/459/orig 2025-12-04T10:32:19.4105720Z * [new branch] gh/wconstab/460/base -> origin/gh/wconstab/460/base 2025-12-04T10:32:19.4105788Z * [new branch] gh/wconstab/460/head -> origin/gh/wconstab/460/head 2025-12-04T10:32:19.4105859Z * [new branch] gh/wconstab/460/orig -> origin/gh/wconstab/460/orig 2025-12-04T10:32:19.4105928Z * [new branch] gh/wconstab/461/base -> origin/gh/wconstab/461/base 2025-12-04T10:32:19.4105997Z * [new branch] gh/wconstab/461/head -> origin/gh/wconstab/461/head 2025-12-04T10:32:19.4106069Z * [new branch] gh/wconstab/461/orig -> origin/gh/wconstab/461/orig 2025-12-04T10:32:19.4106139Z * [new branch] gh/wconstab/462/base -> origin/gh/wconstab/462/base 2025-12-04T10:32:19.4106208Z * [new branch] gh/wconstab/462/head -> origin/gh/wconstab/462/head 2025-12-04T10:32:19.4106309Z * [new branch] gh/wconstab/462/orig -> origin/gh/wconstab/462/orig 2025-12-04T10:32:19.4106380Z * [new branch] gh/wconstab/463/base -> origin/gh/wconstab/463/base 2025-12-04T10:32:19.4106449Z * [new branch] gh/wconstab/463/head -> origin/gh/wconstab/463/head 2025-12-04T10:32:19.4106519Z * [new branch] gh/wconstab/463/orig -> origin/gh/wconstab/463/orig 2025-12-04T10:32:19.4106589Z * [new branch] gh/wconstab/464/base -> origin/gh/wconstab/464/base 2025-12-04T10:32:19.4106658Z * [new branch] gh/wconstab/464/head -> origin/gh/wconstab/464/head 2025-12-04T10:32:19.4106732Z * [new branch] gh/wconstab/464/orig -> origin/gh/wconstab/464/orig 2025-12-04T10:32:19.4106802Z * [new branch] gh/wconstab/465/base -> origin/gh/wconstab/465/base 2025-12-04T10:32:19.4106874Z * [new branch] gh/wconstab/465/head -> 
origin/gh/wconstab/465/head 2025-12-04T10:32:19.4106944Z * [new branch] gh/wconstab/465/orig -> origin/gh/wconstab/465/orig 2025-12-04T10:32:19.4107013Z * [new branch] gh/wconstab/466/base -> origin/gh/wconstab/466/base 2025-12-04T10:32:19.4107085Z * [new branch] gh/wconstab/466/head -> origin/gh/wconstab/466/head 2025-12-04T10:32:19.4107155Z * [new branch] gh/wconstab/466/orig -> origin/gh/wconstab/466/orig 2025-12-04T10:32:19.4107225Z * [new branch] gh/wconstab/467/base -> origin/gh/wconstab/467/base 2025-12-04T10:32:19.4107295Z * [new branch] gh/wconstab/467/head -> origin/gh/wconstab/467/head 2025-12-04T10:32:19.4107368Z * [new branch] gh/wconstab/467/orig -> origin/gh/wconstab/467/orig 2025-12-04T10:32:19.4107437Z * [new branch] gh/wconstab/468/base -> origin/gh/wconstab/468/base 2025-12-04T10:32:19.4107508Z * [new branch] gh/wconstab/468/head -> origin/gh/wconstab/468/head 2025-12-04T10:32:19.4107578Z * [new branch] gh/wconstab/468/orig -> origin/gh/wconstab/468/orig 2025-12-04T10:32:19.4107650Z * [new branch] gh/weifengpy/39/base -> origin/gh/weifengpy/39/base 2025-12-04T10:32:19.4107722Z * [new branch] gh/weifengpy/39/head -> origin/gh/weifengpy/39/head 2025-12-04T10:32:19.4107794Z * [new branch] gh/weifengpy/39/orig -> origin/gh/weifengpy/39/orig 2025-12-04T10:32:19.4107865Z * [new branch] gh/weifengpy/40/base -> origin/gh/weifengpy/40/base 2025-12-04T10:32:19.4107937Z * [new branch] gh/weifengpy/40/head -> origin/gh/weifengpy/40/head 2025-12-04T10:32:19.4108035Z * [new branch] gh/weifengpy/40/orig -> origin/gh/weifengpy/40/orig 2025-12-04T10:32:19.4108105Z * [new branch] gh/weifengpy/41/base -> origin/gh/weifengpy/41/base 2025-12-04T10:32:19.4108177Z * [new branch] gh/weifengpy/41/head -> origin/gh/weifengpy/41/head 2025-12-04T10:32:19.4108250Z * [new branch] gh/weifengpy/41/orig -> origin/gh/weifengpy/41/orig 2025-12-04T10:32:19.4108332Z * [new branch] gh/williamwen42/250/base -> origin/gh/williamwen42/250/base 2025-12-04T10:32:19.4108413Z * [new branch] gh/williamwen42/250/head -> origin/gh/williamwen42/250/head 2025-12-04T10:32:19.4108493Z * [new branch] gh/williamwen42/250/orig -> origin/gh/williamwen42/250/orig 2025-12-04T10:32:19.4108575Z * [new branch] gh/williamwen42/279/base -> origin/gh/williamwen42/279/base 2025-12-04T10:32:19.4108655Z * [new branch] gh/williamwen42/279/head -> origin/gh/williamwen42/279/head 2025-12-04T10:32:19.4108731Z * [new branch] gh/williamwen42/279/orig -> origin/gh/williamwen42/279/orig 2025-12-04T10:32:19.4108808Z * [new branch] gh/williamwen42/282/base -> origin/gh/williamwen42/282/base 2025-12-04T10:32:19.4108887Z * [new branch] gh/williamwen42/282/head -> origin/gh/williamwen42/282/head 2025-12-04T10:32:19.4108994Z * [new branch] gh/williamwen42/282/orig -> origin/gh/williamwen42/282/orig 2025-12-04T10:32:19.4109073Z * [new branch] gh/williamwen42/287/base -> origin/gh/williamwen42/287/base 2025-12-04T10:32:19.4109150Z * [new branch] gh/williamwen42/287/head -> origin/gh/williamwen42/287/head 2025-12-04T10:32:19.4109227Z * [new branch] gh/williamwen42/287/orig -> origin/gh/williamwen42/287/orig 2025-12-04T10:32:19.4109304Z * [new branch] gh/williamwen42/288/base -> origin/gh/williamwen42/288/base 2025-12-04T10:32:19.4109384Z * [new branch] gh/williamwen42/288/head -> origin/gh/williamwen42/288/head 2025-12-04T10:32:19.4109462Z * [new branch] gh/williamwen42/288/orig -> origin/gh/williamwen42/288/orig 2025-12-04T10:32:19.4109542Z * [new branch] gh/williamwen42/296/base -> origin/gh/williamwen42/296/base 2025-12-04T10:32:19.4109658Z * [new 
branch] gh/williamwen42/296/head -> origin/gh/williamwen42/296/head 2025-12-04T10:32:19.4109739Z * [new branch] gh/williamwen42/296/orig -> origin/gh/williamwen42/296/orig 2025-12-04T10:32:19.4109817Z * [new branch] gh/williamwen42/297/base -> origin/gh/williamwen42/297/base 2025-12-04T10:32:19.4109895Z * [new branch] gh/williamwen42/297/head -> origin/gh/williamwen42/297/head 2025-12-04T10:32:19.4109973Z * [new branch] gh/williamwen42/297/orig -> origin/gh/williamwen42/297/orig 2025-12-04T10:32:19.4110052Z * [new branch] gh/williamwen42/306/base -> origin/gh/williamwen42/306/base 2025-12-04T10:32:19.4110129Z * [new branch] gh/williamwen42/306/head -> origin/gh/williamwen42/306/head 2025-12-04T10:32:19.4110209Z * [new branch] gh/williamwen42/306/orig -> origin/gh/williamwen42/306/orig 2025-12-04T10:32:19.4110285Z * [new branch] gh/williamwen42/309/base -> origin/gh/williamwen42/309/base 2025-12-04T10:32:19.4110363Z * [new branch] gh/williamwen42/309/head -> origin/gh/williamwen42/309/head 2025-12-04T10:32:19.4110442Z * [new branch] gh/williamwen42/309/orig -> origin/gh/williamwen42/309/orig 2025-12-04T10:32:19.4110520Z * [new branch] gh/williamwen42/310/base -> origin/gh/williamwen42/310/base 2025-12-04T10:32:19.4110598Z * [new branch] gh/williamwen42/310/head -> origin/gh/williamwen42/310/head 2025-12-04T10:32:19.4110677Z * [new branch] gh/williamwen42/310/orig -> origin/gh/williamwen42/310/orig 2025-12-04T10:32:19.4110797Z * [new branch] gh/williamwen42/311/base -> origin/gh/williamwen42/311/base 2025-12-04T10:32:19.4110875Z * [new branch] gh/williamwen42/311/head -> origin/gh/williamwen42/311/head 2025-12-04T10:32:19.4110958Z * [new branch] gh/williamwen42/311/orig -> origin/gh/williamwen42/311/orig 2025-12-04T10:32:19.4111036Z * [new branch] gh/williamwen42/319/base -> origin/gh/williamwen42/319/base 2025-12-04T10:32:19.4111115Z * [new branch] gh/williamwen42/319/head -> origin/gh/williamwen42/319/head 2025-12-04T10:32:19.4111195Z * [new branch] gh/williamwen42/319/orig -> origin/gh/williamwen42/319/orig 2025-12-04T10:32:19.4111273Z * [new branch] gh/williamwen42/325/base -> origin/gh/williamwen42/325/base 2025-12-04T10:32:19.4111351Z * [new branch] gh/williamwen42/325/head -> origin/gh/williamwen42/325/head 2025-12-04T10:32:19.4111430Z * [new branch] gh/williamwen42/325/orig -> origin/gh/williamwen42/325/orig 2025-12-04T10:32:19.4111510Z * [new branch] gh/williamwen42/326/base -> origin/gh/williamwen42/326/base 2025-12-04T10:32:19.4111589Z * [new branch] gh/williamwen42/326/head -> origin/gh/williamwen42/326/head 2025-12-04T10:32:19.4111665Z * [new branch] gh/williamwen42/326/orig -> origin/gh/williamwen42/326/orig 2025-12-04T10:32:19.4111797Z * [new branch] gh/williamwen42/327/base -> origin/gh/williamwen42/327/base 2025-12-04T10:32:19.4111878Z * [new branch] gh/williamwen42/327/head -> origin/gh/williamwen42/327/head 2025-12-04T10:32:19.4111956Z * [new branch] gh/williamwen42/327/orig -> origin/gh/williamwen42/327/orig 2025-12-04T10:32:19.4112032Z * [new branch] gh/williamwen42/328/base -> origin/gh/williamwen42/328/base 2025-12-04T10:32:19.4112111Z * [new branch] gh/williamwen42/328/head -> origin/gh/williamwen42/328/head 2025-12-04T10:32:19.4112189Z * [new branch] gh/williamwen42/328/orig -> origin/gh/williamwen42/328/orig 2025-12-04T10:32:19.4112266Z * [new branch] gh/williamwen42/329/base -> origin/gh/williamwen42/329/base 2025-12-04T10:32:19.4112344Z * [new branch] gh/williamwen42/329/head -> origin/gh/williamwen42/329/head 2025-12-04T10:32:19.4112424Z * [new branch] 
gh/williamwen42/329/orig -> origin/gh/williamwen42/329/orig 2025-12-04T10:32:19.4112500Z * [new branch] gh/williamwen42/330/base -> origin/gh/williamwen42/330/base 2025-12-04T10:32:19.4112577Z * [new branch] gh/williamwen42/330/head -> origin/gh/williamwen42/330/head 2025-12-04T10:32:19.4112653Z * [new branch] gh/williamwen42/330/orig -> origin/gh/williamwen42/330/orig 2025-12-04T10:32:19.4112731Z * [new branch] gh/williamwen42/331/base -> origin/gh/williamwen42/331/base 2025-12-04T10:32:19.4112809Z * [new branch] gh/williamwen42/331/head -> origin/gh/williamwen42/331/head 2025-12-04T10:32:19.4112889Z * [new branch] gh/williamwen42/331/orig -> origin/gh/williamwen42/331/orig 2025-12-04T10:32:19.4112966Z * [new branch] gh/williamwen42/332/base -> origin/gh/williamwen42/332/base 2025-12-04T10:32:19.4113043Z * [new branch] gh/williamwen42/332/head -> origin/gh/williamwen42/332/head 2025-12-04T10:32:19.4113120Z * [new branch] gh/williamwen42/332/orig -> origin/gh/williamwen42/332/orig 2025-12-04T10:32:19.4113199Z * [new branch] gh/williamwen42/333/base -> origin/gh/williamwen42/333/base 2025-12-04T10:32:19.4113275Z * [new branch] gh/williamwen42/333/head -> origin/gh/williamwen42/333/head 2025-12-04T10:32:19.4113352Z * [new branch] gh/williamwen42/333/orig -> origin/gh/williamwen42/333/orig 2025-12-04T10:32:19.4113431Z * [new branch] gh/williamwen42/334/base -> origin/gh/williamwen42/334/base 2025-12-04T10:32:19.4113537Z * [new branch] gh/williamwen42/334/head -> origin/gh/williamwen42/334/head 2025-12-04T10:32:19.4113616Z * [new branch] gh/williamwen42/334/orig -> origin/gh/williamwen42/334/orig 2025-12-04T10:32:19.4113694Z * [new branch] gh/williamwen42/335/base -> origin/gh/williamwen42/335/base 2025-12-04T10:32:19.4113773Z * [new branch] gh/williamwen42/335/head -> origin/gh/williamwen42/335/head 2025-12-04T10:32:19.4113850Z * [new branch] gh/williamwen42/335/orig -> origin/gh/williamwen42/335/orig 2025-12-04T10:32:19.4113929Z * [new branch] gh/williamwen42/336/base -> origin/gh/williamwen42/336/base 2025-12-04T10:32:19.4114006Z * [new branch] gh/williamwen42/336/head -> origin/gh/williamwen42/336/head 2025-12-04T10:32:19.4114083Z * [new branch] gh/williamwen42/336/orig -> origin/gh/williamwen42/336/orig 2025-12-04T10:32:19.4114162Z * [new branch] gh/williamwen42/337/base -> origin/gh/williamwen42/337/base 2025-12-04T10:32:19.4114241Z * [new branch] gh/williamwen42/337/head -> origin/gh/williamwen42/337/head 2025-12-04T10:32:19.4114319Z * [new branch] gh/williamwen42/337/orig -> origin/gh/williamwen42/337/orig 2025-12-04T10:32:19.4114398Z * [new branch] gh/williamwen42/338/base -> origin/gh/williamwen42/338/base 2025-12-04T10:32:19.4115430Z * [new branch] gh/williamwen42/338/head -> origin/gh/williamwen42/338/head 2025-12-04T10:32:19.4115510Z * [new branch] gh/williamwen42/338/orig -> origin/gh/williamwen42/338/orig 2025-12-04T10:32:19.4115586Z * [new branch] gh/williamwen42/339/base -> origin/gh/williamwen42/339/base 2025-12-04T10:32:19.4115662Z * [new branch] gh/williamwen42/339/head -> origin/gh/williamwen42/339/head 2025-12-04T10:32:19.4115739Z * [new branch] gh/williamwen42/339/orig -> origin/gh/williamwen42/339/orig 2025-12-04T10:32:19.4115817Z * [new branch] gh/williamwen42/340/base -> origin/gh/williamwen42/340/base 2025-12-04T10:32:19.4115894Z * [new branch] gh/williamwen42/340/head -> origin/gh/williamwen42/340/head 2025-12-04T10:32:19.4115974Z * [new branch] gh/williamwen42/340/orig -> origin/gh/williamwen42/340/orig 2025-12-04T10:32:19.4116052Z * [new branch] 
gh/williamwen42/341/base -> origin/gh/williamwen42/341/base 2025-12-04T10:32:19.4116130Z * [new branch] gh/williamwen42/341/head -> origin/gh/williamwen42/341/head 2025-12-04T10:32:19.4116208Z * [new branch] gh/williamwen42/341/orig -> origin/gh/williamwen42/341/orig 2025-12-04T10:32:19.4116288Z * [new branch] gh/williamwen42/342/base -> origin/gh/williamwen42/342/base 2025-12-04T10:32:19.4116365Z * [new branch] gh/williamwen42/342/head -> origin/gh/williamwen42/342/head 2025-12-04T10:32:19.4116443Z * [new branch] gh/williamwen42/342/orig -> origin/gh/williamwen42/342/orig 2025-12-04T10:32:19.4116521Z * [new branch] gh/williamwen42/343/base -> origin/gh/williamwen42/343/base 2025-12-04T10:32:19.4116598Z * [new branch] gh/williamwen42/343/head -> origin/gh/williamwen42/343/head 2025-12-04T10:32:19.4116679Z * [new branch] gh/williamwen42/343/orig -> origin/gh/williamwen42/343/orig 2025-12-04T10:32:19.4116760Z * [new branch] gh/williamwen42/344/base -> origin/gh/williamwen42/344/base 2025-12-04T10:32:19.4116839Z * [new branch] gh/williamwen42/344/head -> origin/gh/williamwen42/344/head 2025-12-04T10:32:19.4116916Z * [new branch] gh/williamwen42/344/orig -> origin/gh/williamwen42/344/orig 2025-12-04T10:32:19.4116993Z * [new branch] gh/williamwen42/345/base -> origin/gh/williamwen42/345/base 2025-12-04T10:32:19.4117072Z * [new branch] gh/williamwen42/345/head -> origin/gh/williamwen42/345/head 2025-12-04T10:32:19.4117178Z * [new branch] gh/williamwen42/345/orig -> origin/gh/williamwen42/345/orig 2025-12-04T10:32:19.4117255Z * [new branch] gh/williamwen42/346/base -> origin/gh/williamwen42/346/base 2025-12-04T10:32:19.4117334Z * [new branch] gh/williamwen42/346/head -> origin/gh/williamwen42/346/head 2025-12-04T10:32:19.4117414Z * [new branch] gh/williamwen42/346/orig -> origin/gh/williamwen42/346/orig 2025-12-04T10:32:19.4117492Z * [new branch] gh/williamwen42/347/base -> origin/gh/williamwen42/347/base 2025-12-04T10:32:19.4117569Z * [new branch] gh/williamwen42/347/head -> origin/gh/williamwen42/347/head 2025-12-04T10:32:19.4117645Z * [new branch] gh/williamwen42/347/orig -> origin/gh/williamwen42/347/orig 2025-12-04T10:32:19.4117721Z * [new branch] gh/williamwen42/348/base -> origin/gh/williamwen42/348/base 2025-12-04T10:32:19.4117799Z * [new branch] gh/williamwen42/348/head -> origin/gh/williamwen42/348/head 2025-12-04T10:32:19.4117878Z * [new branch] gh/williamwen42/348/orig -> origin/gh/williamwen42/348/orig 2025-12-04T10:32:19.4117955Z * [new branch] gh/williamwen42/349/base -> origin/gh/williamwen42/349/base 2025-12-04T10:32:19.4118034Z * [new branch] gh/williamwen42/349/head -> origin/gh/williamwen42/349/head 2025-12-04T10:32:19.4118135Z * [new branch] gh/williamwen42/349/orig -> origin/gh/williamwen42/349/orig 2025-12-04T10:32:19.4118213Z * [new branch] gh/williamwen42/350/base -> origin/gh/williamwen42/350/base 2025-12-04T10:32:19.4118290Z * [new branch] gh/williamwen42/350/head -> origin/gh/williamwen42/350/head 2025-12-04T10:32:19.4118367Z * [new branch] gh/williamwen42/350/orig -> origin/gh/williamwen42/350/orig 2025-12-04T10:32:19.4118446Z * [new branch] gh/williamwen42/351/base -> origin/gh/williamwen42/351/base 2025-12-04T10:32:19.4118526Z * [new branch] gh/williamwen42/351/head -> origin/gh/williamwen42/351/head 2025-12-04T10:32:19.4118602Z * [new branch] gh/williamwen42/351/orig -> origin/gh/williamwen42/351/orig 2025-12-04T10:32:19.4118683Z * [new branch] gh/williamwen42/352/base -> origin/gh/williamwen42/352/base 2025-12-04T10:32:19.4118759Z * [new branch] 
gh/williamwen42/352/head -> origin/gh/williamwen42/352/head 2025-12-04T10:32:19.4118837Z * [new branch] gh/williamwen42/352/orig -> origin/gh/williamwen42/352/orig 2025-12-04T10:32:19.4118918Z * [new branch] gh/williamwen42/353/base -> origin/gh/williamwen42/353/base 2025-12-04T10:32:19.4118995Z * [new branch] gh/williamwen42/353/head -> origin/gh/williamwen42/353/head 2025-12-04T10:32:19.4119071Z * [new branch] gh/williamwen42/353/orig -> origin/gh/williamwen42/353/orig 2025-12-04T10:32:19.4119150Z * [new branch] gh/williamwen42/354/base -> origin/gh/williamwen42/354/base 2025-12-04T10:32:19.4119228Z * [new branch] gh/williamwen42/354/head -> origin/gh/williamwen42/354/head 2025-12-04T10:32:19.4119305Z * [new branch] gh/williamwen42/354/orig -> origin/gh/williamwen42/354/orig 2025-12-04T10:32:19.4119385Z * [new branch] gh/williamwen42/355/base -> origin/gh/williamwen42/355/base 2025-12-04T10:32:19.4119464Z * [new branch] gh/williamwen42/355/head -> origin/gh/williamwen42/355/head 2025-12-04T10:32:19.4119541Z * [new branch] gh/williamwen42/355/orig -> origin/gh/williamwen42/355/orig 2025-12-04T10:32:19.4119656Z * [new branch] gh/williamwen42/356/base -> origin/gh/williamwen42/356/base 2025-12-04T10:32:19.4119737Z * [new branch] gh/williamwen42/356/head -> origin/gh/williamwen42/356/head 2025-12-04T10:32:19.4119815Z * [new branch] gh/williamwen42/356/orig -> origin/gh/williamwen42/356/orig 2025-12-04T10:32:19.4119942Z * [new branch] gh/williamwen42/357/base -> origin/gh/williamwen42/357/base 2025-12-04T10:32:19.4120020Z * [new branch] gh/williamwen42/357/head -> origin/gh/williamwen42/357/head 2025-12-04T10:32:19.4120100Z * [new branch] gh/williamwen42/357/orig -> origin/gh/williamwen42/357/orig 2025-12-04T10:32:19.4120180Z * [new branch] gh/williamwen42/358/base -> origin/gh/williamwen42/358/base 2025-12-04T10:32:19.4120260Z * [new branch] gh/williamwen42/358/head -> origin/gh/williamwen42/358/head 2025-12-04T10:32:19.4120340Z * [new branch] gh/williamwen42/358/orig -> origin/gh/williamwen42/358/orig 2025-12-04T10:32:19.4120409Z * [new branch] gh/xmfan/169/base -> origin/gh/xmfan/169/base 2025-12-04T10:32:19.4120475Z * [new branch] gh/xmfan/169/head -> origin/gh/xmfan/169/head 2025-12-04T10:32:19.4120543Z * [new branch] gh/xmfan/170/base -> origin/gh/xmfan/170/base 2025-12-04T10:32:19.4120611Z * [new branch] gh/xmfan/170/head -> origin/gh/xmfan/170/head 2025-12-04T10:32:19.4120678Z * [new branch] gh/xmfan/274/base -> origin/gh/xmfan/274/base 2025-12-04T10:32:19.4120743Z * [new branch] gh/xmfan/274/head -> origin/gh/xmfan/274/head 2025-12-04T10:32:19.4120808Z * [new branch] gh/xmfan/274/orig -> origin/gh/xmfan/274/orig 2025-12-04T10:32:19.4120917Z * [new branch] gh/xmfan/277/base -> origin/gh/xmfan/277/base 2025-12-04T10:32:19.4120985Z * [new branch] gh/xmfan/277/head -> origin/gh/xmfan/277/head 2025-12-04T10:32:19.4121050Z * [new branch] gh/xmfan/277/orig -> origin/gh/xmfan/277/orig 2025-12-04T10:32:19.4121116Z * [new branch] gh/xmfan/301/base -> origin/gh/xmfan/301/base 2025-12-04T10:32:19.4121183Z * [new branch] gh/xmfan/301/head -> origin/gh/xmfan/301/head 2025-12-04T10:32:19.4121249Z * [new branch] gh/xmfan/301/orig -> origin/gh/xmfan/301/orig 2025-12-04T10:32:19.4121315Z * [new branch] gh/xmfan/304/base -> origin/gh/xmfan/304/base 2025-12-04T10:32:19.4121381Z * [new branch] gh/xmfan/304/head -> origin/gh/xmfan/304/head 2025-12-04T10:32:19.4121447Z * [new branch] gh/xmfan/304/orig -> origin/gh/xmfan/304/orig 2025-12-04T10:32:19.4121516Z * [new branch] gh/xmfan/309/base -> 
origin/gh/xmfan/309/base 2025-12-04T10:32:19.4121582Z * [new branch] gh/xmfan/309/head -> origin/gh/xmfan/309/head 2025-12-04T10:32:19.4121647Z * [new branch] gh/xmfan/309/orig -> origin/gh/xmfan/309/orig 2025-12-04T10:32:19.4121715Z * [new branch] gh/xmfan/310/base -> origin/gh/xmfan/310/base 2025-12-04T10:32:19.4121779Z * [new branch] gh/xmfan/310/head -> origin/gh/xmfan/310/head 2025-12-04T10:32:19.4121848Z * [new branch] gh/xmfan/310/orig -> origin/gh/xmfan/310/orig 2025-12-04T10:32:19.4121915Z * [new branch] gh/xmfan/311/base -> origin/gh/xmfan/311/base 2025-12-04T10:32:19.4121980Z * [new branch] gh/xmfan/311/head -> origin/gh/xmfan/311/head 2025-12-04T10:32:19.4122044Z * [new branch] gh/xmfan/311/orig -> origin/gh/xmfan/311/orig 2025-12-04T10:32:19.4122111Z * [new branch] gh/xmfan/312/base -> origin/gh/xmfan/312/base 2025-12-04T10:32:19.4122177Z * [new branch] gh/xmfan/312/head -> origin/gh/xmfan/312/head 2025-12-04T10:32:19.4122243Z * [new branch] gh/xmfan/312/orig -> origin/gh/xmfan/312/orig 2025-12-04T10:32:19.4122310Z * [new branch] gh/xmfan/313/base -> origin/gh/xmfan/313/base 2025-12-04T10:32:19.4122376Z * [new branch] gh/xmfan/313/head -> origin/gh/xmfan/313/head 2025-12-04T10:32:19.4122440Z * [new branch] gh/xmfan/313/orig -> origin/gh/xmfan/313/orig 2025-12-04T10:32:19.4122549Z * [new branch] gh/xuanzhang816/27/base -> origin/gh/xuanzhang816/27/base 2025-12-04T10:32:19.4122627Z * [new branch] gh/xuanzhang816/27/head -> origin/gh/xuanzhang816/27/head 2025-12-04T10:32:19.4122702Z * [new branch] gh/xuanzhang816/27/orig -> origin/gh/xuanzhang816/27/orig 2025-12-04T10:32:19.4122778Z * [new branch] gh/xuanzhang816/32/base -> origin/gh/xuanzhang816/32/base 2025-12-04T10:32:19.4122854Z * [new branch] gh/xuanzhang816/32/head -> origin/gh/xuanzhang816/32/head 2025-12-04T10:32:19.4122929Z * [new branch] gh/xuanzhang816/32/orig -> origin/gh/xuanzhang816/32/orig 2025-12-04T10:32:19.4123003Z * [new branch] gh/xuanzhang816/33/base -> origin/gh/xuanzhang816/33/base 2025-12-04T10:32:19.4123076Z * [new branch] gh/xuanzhang816/33/head -> origin/gh/xuanzhang816/33/head 2025-12-04T10:32:19.4123153Z * [new branch] gh/xuanzhang816/33/orig -> origin/gh/xuanzhang816/33/orig 2025-12-04T10:32:19.4123227Z * [new branch] gh/xuanzhang816/34/base -> origin/gh/xuanzhang816/34/base 2025-12-04T10:32:19.4123300Z * [new branch] gh/xuanzhang816/34/head -> origin/gh/xuanzhang816/34/head 2025-12-04T10:32:19.4123376Z * [new branch] gh/xuanzhang816/34/orig -> origin/gh/xuanzhang816/34/orig 2025-12-04T10:32:19.4123486Z * [new branch] gh/xuanzhang816/35/base -> origin/gh/xuanzhang816/35/base 2025-12-04T10:32:19.4123559Z * [new branch] gh/xuanzhang816/35/head -> origin/gh/xuanzhang816/35/head 2025-12-04T10:32:19.4123635Z * [new branch] gh/xuanzhang816/35/orig -> origin/gh/xuanzhang816/35/orig 2025-12-04T10:32:19.4123707Z * [new branch] gh/yanbing-j/11/base -> origin/gh/yanbing-j/11/base 2025-12-04T10:32:19.4123779Z * [new branch] gh/yanbing-j/11/head -> origin/gh/yanbing-j/11/head 2025-12-04T10:32:19.4123852Z * [new branch] gh/yanbing-j/11/orig -> origin/gh/yanbing-j/11/orig 2025-12-04T10:32:19.4123921Z * [new branch] gh/yanbing-j/12/base -> origin/gh/yanbing-j/12/base 2025-12-04T10:32:19.4123989Z * [new branch] gh/yanbing-j/12/head -> origin/gh/yanbing-j/12/head 2025-12-04T10:32:19.4124061Z * [new branch] gh/yanbing-j/12/orig -> origin/gh/yanbing-j/12/orig 2025-12-04T10:32:19.4124129Z * [new branch] gh/yanbing-j/13/base -> origin/gh/yanbing-j/13/base 2025-12-04T10:32:19.4124199Z * [new branch] gh/yanbing-j/13/head -> 
origin/gh/yanbing-j/13/head 2025-12-04T10:32:19.4124269Z * [new branch] gh/yanbing-j/13/orig -> origin/gh/yanbing-j/13/orig 2025-12-04T10:32:19.4124339Z * [new branch] gh/yanbing-j/14/base -> origin/gh/yanbing-j/14/base 2025-12-04T10:32:19.4124408Z * [new branch] gh/yanbing-j/14/head -> origin/gh/yanbing-j/14/head 2025-12-04T10:32:19.4124479Z * [new branch] gh/yanbing-j/14/orig -> origin/gh/yanbing-j/14/orig 2025-12-04T10:32:19.4124547Z * [new branch] gh/yanbing-j/15/base -> origin/gh/yanbing-j/15/base 2025-12-04T10:32:19.4124618Z * [new branch] gh/yanbing-j/15/head -> origin/gh/yanbing-j/15/head 2025-12-04T10:32:19.4124687Z * [new branch] gh/yanbing-j/15/orig -> origin/gh/yanbing-j/15/orig 2025-12-04T10:32:19.4124754Z * [new branch] gh/yanbing-j/18/base -> origin/gh/yanbing-j/18/base 2025-12-04T10:32:19.4124825Z * [new branch] gh/yanbing-j/18/head -> origin/gh/yanbing-j/18/head 2025-12-04T10:32:19.4124894Z * [new branch] gh/yanbing-j/18/orig -> origin/gh/yanbing-j/18/orig 2025-12-04T10:32:19.4124962Z * [new branch] gh/yanbing-j/19/base -> origin/gh/yanbing-j/19/base 2025-12-04T10:32:19.4125034Z * [new branch] gh/yanbing-j/19/head -> origin/gh/yanbing-j/19/head 2025-12-04T10:32:19.4125155Z * [new branch] gh/yanbing-j/19/orig -> origin/gh/yanbing-j/19/orig 2025-12-04T10:32:19.4125224Z * [new branch] gh/yanbing-j/20/base -> origin/gh/yanbing-j/20/base 2025-12-04T10:32:19.4125296Z * [new branch] gh/yanbing-j/20/head -> origin/gh/yanbing-j/20/head 2025-12-04T10:32:19.4125367Z * [new branch] gh/yanbing-j/20/orig -> origin/gh/yanbing-j/20/orig 2025-12-04T10:32:19.4125435Z * [new branch] gh/yanbing-j/21/base -> origin/gh/yanbing-j/21/base 2025-12-04T10:32:19.4125505Z * [new branch] gh/yanbing-j/21/head -> origin/gh/yanbing-j/21/head 2025-12-04T10:32:19.4125575Z * [new branch] gh/yanbing-j/22/base -> origin/gh/yanbing-j/22/base 2025-12-04T10:32:19.4125643Z * [new branch] gh/yanbing-j/22/head -> origin/gh/yanbing-j/22/head 2025-12-04T10:32:19.4125712Z * [new branch] gh/yanbing-j/22/orig -> origin/gh/yanbing-j/22/orig 2025-12-04T10:32:19.4125782Z * [new branch] gh/yanbing-j/23/base -> origin/gh/yanbing-j/23/base 2025-12-04T10:32:19.4125851Z * [new branch] gh/yanbing-j/23/head -> origin/gh/yanbing-j/23/head 2025-12-04T10:32:19.4125923Z * [new branch] gh/yanbing-j/23/orig -> origin/gh/yanbing-j/23/orig 2025-12-04T10:32:19.4126022Z * [new branch] gh/yanbing-j/24/base -> origin/gh/yanbing-j/24/base 2025-12-04T10:32:19.4126092Z * [new branch] gh/yanbing-j/24/head -> origin/gh/yanbing-j/24/head 2025-12-04T10:32:19.4126164Z * [new branch] gh/yanbing-j/24/orig -> origin/gh/yanbing-j/24/orig 2025-12-04T10:32:19.4126232Z * [new branch] gh/yanbing-j/25/base -> origin/gh/yanbing-j/25/base 2025-12-04T10:32:19.4126302Z * [new branch] gh/yanbing-j/25/head -> origin/gh/yanbing-j/25/head 2025-12-04T10:32:19.4126371Z * [new branch] gh/yanbing-j/25/orig -> origin/gh/yanbing-j/25/orig 2025-12-04T10:32:19.4126441Z * [new branch] gh/yanbing-j/26/base -> origin/gh/yanbing-j/26/base 2025-12-04T10:32:19.4126513Z * [new branch] gh/yanbing-j/26/head -> origin/gh/yanbing-j/26/head 2025-12-04T10:32:19.4126582Z * [new branch] gh/yanbing-j/26/orig -> origin/gh/yanbing-j/26/orig 2025-12-04T10:32:19.4126662Z * [new branch] gh/yang-yu-hang/1/base -> origin/gh/yang-yu-hang/1/base 2025-12-04T10:32:19.4126740Z * [new branch] gh/yang-yu-hang/1/head -> origin/gh/yang-yu-hang/1/head 2025-12-04T10:32:19.4126813Z * [new branch] gh/yang-yu-hang/1/orig -> origin/gh/yang-yu-hang/1/orig 2025-12-04T10:32:19.4126887Z * [new branch] 
gh/yang-yu-hang/2/base -> origin/gh/yang-yu-hang/2/base 2025-12-04T10:32:19.4126960Z * [new branch] gh/yang-yu-hang/2/head -> origin/gh/yang-yu-hang/2/head 2025-12-04T10:32:19.4127033Z * [new branch] gh/yang-yu-hang/2/orig -> origin/gh/yang-yu-hang/2/orig 2025-12-04T10:32:19.4127106Z * [new branch] gh/yang-yu-hang/3/base -> origin/gh/yang-yu-hang/3/base 2025-12-04T10:32:19.4127182Z * [new branch] gh/yang-yu-hang/3/head -> origin/gh/yang-yu-hang/3/head 2025-12-04T10:32:19.4127254Z * [new branch] gh/yang-yu-hang/3/orig -> origin/gh/yang-yu-hang/3/orig 2025-12-04T10:32:19.4127327Z * [new branch] gh/yangw-dev/12/base -> origin/gh/yangw-dev/12/base 2025-12-04T10:32:19.4127400Z * [new branch] gh/yangw-dev/12/head -> origin/gh/yangw-dev/12/head 2025-12-04T10:32:19.4127471Z * [new branch] gh/yangw-dev/12/orig -> origin/gh/yangw-dev/12/orig 2025-12-04T10:32:19.4127541Z * [new branch] gh/yangw-dev/13/base -> origin/gh/yangw-dev/13/base 2025-12-04T10:32:19.4127615Z * [new branch] gh/yangw-dev/13/head -> origin/gh/yangw-dev/13/head 2025-12-04T10:32:19.4127712Z * [new branch] gh/yangw-dev/13/orig -> origin/gh/yangw-dev/13/orig 2025-12-04T10:32:19.4127785Z * [new branch] gh/yangw-dev/14/base -> origin/gh/yangw-dev/14/base 2025-12-04T10:32:19.4127853Z * [new branch] gh/yangw-dev/14/head -> origin/gh/yangw-dev/14/head 2025-12-04T10:32:19.4127921Z * [new branch] gh/yangw-dev/14/orig -> origin/gh/yangw-dev/14/orig 2025-12-04T10:32:19.4127992Z * [new branch] gh/yangw-dev/15/base -> origin/gh/yangw-dev/15/base 2025-12-04T10:32:19.4128060Z * [new branch] gh/yangw-dev/15/head -> origin/gh/yangw-dev/15/head 2025-12-04T10:32:19.4128128Z * [new branch] gh/yangw-dev/15/orig -> origin/gh/yangw-dev/15/orig 2025-12-04T10:32:19.4128197Z * [new branch] gh/yangw-dev/19/base -> origin/gh/yangw-dev/19/base 2025-12-04T10:32:19.4128267Z * [new branch] gh/yangw-dev/19/head -> origin/gh/yangw-dev/19/head 2025-12-04T10:32:19.4128336Z * [new branch] gh/yangw-dev/19/orig -> origin/gh/yangw-dev/19/orig 2025-12-04T10:32:19.4128409Z * [new branch] gh/yangw-dev/26/base -> origin/gh/yangw-dev/26/base 2025-12-04T10:32:19.4128478Z * [new branch] gh/yangw-dev/26/head -> origin/gh/yangw-dev/26/head 2025-12-04T10:32:19.4128546Z * [new branch] gh/yangw-dev/26/orig -> origin/gh/yangw-dev/26/orig 2025-12-04T10:32:19.4128654Z * [new branch] gh/yangw-dev/27/base -> origin/gh/yangw-dev/27/base 2025-12-04T10:32:19.4128723Z * [new branch] gh/yangw-dev/27/head -> origin/gh/yangw-dev/27/head 2025-12-04T10:32:19.4128791Z * [new branch] gh/yangw-dev/27/orig -> origin/gh/yangw-dev/27/orig 2025-12-04T10:32:19.4128860Z * [new branch] gh/ydwu4/292/base -> origin/gh/ydwu4/292/base 2025-12-04T10:32:19.4128927Z * [new branch] gh/ydwu4/292/head -> origin/gh/ydwu4/292/head 2025-12-04T10:32:19.4128995Z * [new branch] gh/ydwu4/292/orig -> origin/gh/ydwu4/292/orig 2025-12-04T10:32:19.4129062Z * [new branch] gh/ydwu4/294/base -> origin/gh/ydwu4/294/base 2025-12-04T10:32:19.4129126Z * [new branch] gh/ydwu4/294/head -> origin/gh/ydwu4/294/head 2025-12-04T10:32:19.4129190Z * [new branch] gh/ydwu4/294/orig -> origin/gh/ydwu4/294/orig 2025-12-04T10:32:19.4129258Z * [new branch] gh/ydwu4/295/base -> origin/gh/ydwu4/295/base 2025-12-04T10:32:19.4129322Z * [new branch] gh/ydwu4/295/head -> origin/gh/ydwu4/295/head 2025-12-04T10:32:19.4129387Z * [new branch] gh/ydwu4/295/orig -> origin/gh/ydwu4/295/orig 2025-12-04T10:32:19.4129453Z * [new branch] gh/ydwu4/296/base -> origin/gh/ydwu4/296/base 2025-12-04T10:32:19.4129519Z * [new branch] gh/ydwu4/296/head -> 
origin/gh/ydwu4/296/head 2025-12-04T10:32:19.4129620Z * [new branch] gh/ydwu4/296/orig -> origin/gh/ydwu4/296/orig 2025-12-04T10:32:19.4129688Z * [new branch] gh/ydwu4/306/base -> origin/gh/ydwu4/306/base 2025-12-04T10:32:19.4129752Z * [new branch] gh/ydwu4/306/head -> origin/gh/ydwu4/306/head 2025-12-04T10:32:19.4129818Z * [new branch] gh/ydwu4/306/orig -> origin/gh/ydwu4/306/orig 2025-12-04T10:32:19.4129883Z * [new branch] gh/ydwu4/312/base -> origin/gh/ydwu4/312/base 2025-12-04T10:32:19.4129947Z * [new branch] gh/ydwu4/312/head -> origin/gh/ydwu4/312/head 2025-12-04T10:32:19.4130014Z * [new branch] gh/ydwu4/312/orig -> origin/gh/ydwu4/312/orig 2025-12-04T10:32:19.4130078Z * [new branch] gh/ydwu4/322/base -> origin/gh/ydwu4/322/base 2025-12-04T10:32:19.4130141Z * [new branch] gh/ydwu4/322/head -> origin/gh/ydwu4/322/head 2025-12-04T10:32:19.4130250Z * [new branch] gh/ydwu4/322/orig -> origin/gh/ydwu4/322/orig 2025-12-04T10:32:19.4130314Z * [new branch] gh/ydwu4/327/base -> origin/gh/ydwu4/327/base 2025-12-04T10:32:19.4130380Z * [new branch] gh/ydwu4/327/head -> origin/gh/ydwu4/327/head 2025-12-04T10:32:19.4130446Z * [new branch] gh/ydwu4/327/orig -> origin/gh/ydwu4/327/orig 2025-12-04T10:32:19.4130513Z * [new branch] gh/ydwu4/328/base -> origin/gh/ydwu4/328/base 2025-12-04T10:32:19.4130577Z * [new branch] gh/ydwu4/328/head -> origin/gh/ydwu4/328/head 2025-12-04T10:32:19.4130643Z * [new branch] gh/ydwu4/328/orig -> origin/gh/ydwu4/328/orig 2025-12-04T10:32:19.4130707Z * [new branch] gh/ydwu4/329/base -> origin/gh/ydwu4/329/base 2025-12-04T10:32:19.4130773Z * [new branch] gh/ydwu4/329/head -> origin/gh/ydwu4/329/head 2025-12-04T10:32:19.4130839Z * [new branch] gh/ydwu4/329/orig -> origin/gh/ydwu4/329/orig 2025-12-04T10:32:19.4130906Z * [new branch] gh/ydwu4/330/base -> origin/gh/ydwu4/330/base 2025-12-04T10:32:19.4130973Z * [new branch] gh/ydwu4/330/head -> origin/gh/ydwu4/330/head 2025-12-04T10:32:19.4131037Z * [new branch] gh/ydwu4/330/orig -> origin/gh/ydwu4/330/orig 2025-12-04T10:32:19.4131127Z * [new branch] gh/ydwu4/331/base -> origin/gh/ydwu4/331/base 2025-12-04T10:32:19.4131195Z * [new branch] gh/ydwu4/331/head -> origin/gh/ydwu4/331/head 2025-12-04T10:32:19.4131260Z * [new branch] gh/ydwu4/331/orig -> origin/gh/ydwu4/331/orig 2025-12-04T10:32:19.4131325Z * [new branch] gh/ydwu4/332/base -> origin/gh/ydwu4/332/base 2025-12-04T10:32:19.4131391Z * [new branch] gh/ydwu4/332/head -> origin/gh/ydwu4/332/head 2025-12-04T10:32:19.4131455Z * [new branch] gh/ydwu4/332/orig -> origin/gh/ydwu4/332/orig 2025-12-04T10:32:19.4131523Z * [new branch] gh/ydwu4/333/base -> origin/gh/ydwu4/333/base 2025-12-04T10:32:19.4131590Z * [new branch] gh/ydwu4/333/head -> origin/gh/ydwu4/333/head 2025-12-04T10:32:19.4131655Z * [new branch] gh/ydwu4/333/orig -> origin/gh/ydwu4/333/orig 2025-12-04T10:32:19.4131722Z * [new branch] gh/ydwu4/334/base -> origin/gh/ydwu4/334/base 2025-12-04T10:32:19.4131789Z * [new branch] gh/ydwu4/334/head -> origin/gh/ydwu4/334/head 2025-12-04T10:32:19.4131853Z * [new branch] gh/ydwu4/334/orig -> origin/gh/ydwu4/334/orig 2025-12-04T10:32:19.4131918Z * [new branch] gh/ydwu4/335/base -> origin/gh/ydwu4/335/base 2025-12-04T10:32:19.4131984Z * [new branch] gh/ydwu4/335/head -> origin/gh/ydwu4/335/head 2025-12-04T10:32:19.4132049Z * [new branch] gh/ydwu4/335/orig -> origin/gh/ydwu4/335/orig 2025-12-04T10:32:19.4132115Z * [new branch] gh/ydwu4/337/base -> origin/gh/ydwu4/337/base 2025-12-04T10:32:19.4132181Z * [new branch] gh/ydwu4/337/head -> origin/gh/ydwu4/337/head 
2025-12-04T10:32:19.4132246Z * [new branch] gh/ydwu4/337/orig -> origin/gh/ydwu4/337/orig 2025-12-04T10:32:19.4132312Z * [new branch] gh/ydwu4/339/base -> origin/gh/ydwu4/339/base 2025-12-04T10:32:19.4132378Z * [new branch] gh/ydwu4/339/head -> origin/gh/ydwu4/339/head 2025-12-04T10:32:19.4132443Z * [new branch] gh/ydwu4/339/orig -> origin/gh/ydwu4/339/orig 2025-12-04T10:32:19.4132509Z * [new branch] gh/yf225/133/base -> origin/gh/yf225/133/base 2025-12-04T10:32:19.4132572Z * [new branch] gh/yf225/133/head -> origin/gh/yf225/133/head 2025-12-04T10:32:19.4132636Z * [new branch] gh/yf225/93/base -> origin/gh/yf225/93/base 2025-12-04T10:32:19.4132731Z * [new branch] gh/yf225/93/head -> origin/gh/yf225/93/head 2025-12-04T10:32:19.4132804Z * [new branch] gh/yifuwang/152/base -> origin/gh/yifuwang/152/base 2025-12-04T10:32:19.4132876Z * [new branch] gh/yifuwang/152/head -> origin/gh/yifuwang/152/head 2025-12-04T10:32:19.4132951Z * [new branch] gh/yifuwang/152/orig -> origin/gh/yifuwang/152/orig 2025-12-04T10:32:19.4133021Z * [new branch] gh/yifuwang/195/base -> origin/gh/yifuwang/195/base 2025-12-04T10:32:19.4133092Z * [new branch] gh/yifuwang/195/head -> origin/gh/yifuwang/195/head 2025-12-04T10:32:19.4133166Z * [new branch] gh/yifuwang/195/orig -> origin/gh/yifuwang/195/orig 2025-12-04T10:32:19.4133239Z * [new branch] gh/yiming0416/1/base -> origin/gh/yiming0416/1/base 2025-12-04T10:32:19.4133311Z * [new branch] gh/yiming0416/1/head -> origin/gh/yiming0416/1/head 2025-12-04T10:32:19.4133383Z * [new branch] gh/yiming0416/2/base -> origin/gh/yiming0416/2/base 2025-12-04T10:32:19.4133452Z * [new branch] gh/yiming0416/2/head -> origin/gh/yiming0416/2/head 2025-12-04T10:32:19.4133526Z * [new branch] gh/yushangdi/1/base -> origin/gh/yushangdi/1/base 2025-12-04T10:32:19.4133631Z * [new branch] gh/yushangdi/1/head -> origin/gh/yushangdi/1/head 2025-12-04T10:32:19.4133703Z * [new branch] gh/yushangdi/10/base -> origin/gh/yushangdi/10/base 2025-12-04T10:32:19.4133774Z * [new branch] gh/yushangdi/10/head -> origin/gh/yushangdi/10/head 2025-12-04T10:32:19.4133848Z * [new branch] gh/yushangdi/10/orig -> origin/gh/yushangdi/10/orig 2025-12-04T10:32:19.4133919Z * [new branch] gh/yushangdi/11/base -> origin/gh/yushangdi/11/base 2025-12-04T10:32:19.4133990Z * [new branch] gh/yushangdi/11/head -> origin/gh/yushangdi/11/head 2025-12-04T10:32:19.4134062Z * [new branch] gh/yushangdi/11/orig -> origin/gh/yushangdi/11/orig 2025-12-04T10:32:19.4134134Z * [new branch] gh/yushangdi/2/base -> origin/gh/yushangdi/2/base 2025-12-04T10:32:19.4134206Z * [new branch] gh/yushangdi/2/head -> origin/gh/yushangdi/2/head 2025-12-04T10:32:19.4134278Z * [new branch] gh/yushangdi/7/base -> origin/gh/yushangdi/7/base 2025-12-04T10:32:19.4134347Z * [new branch] gh/yushangdi/7/head -> origin/gh/yushangdi/7/head 2025-12-04T10:32:19.4134418Z * [new branch] gh/yushangdi/7/orig -> origin/gh/yushangdi/7/orig 2025-12-04T10:32:19.4134487Z * [new branch] gh/yushangdi/8/base -> origin/gh/yushangdi/8/base 2025-12-04T10:32:19.4134558Z * [new branch] gh/yushangdi/8/head -> origin/gh/yushangdi/8/head 2025-12-04T10:32:19.4134629Z * [new branch] gh/yushangdi/8/orig -> origin/gh/yushangdi/8/orig 2025-12-04T10:32:19.4134702Z * [new branch] gh/yushangdi/9/base -> origin/gh/yushangdi/9/base 2025-12-04T10:32:19.4134771Z * [new branch] gh/yushangdi/9/head -> origin/gh/yushangdi/9/head 2025-12-04T10:32:19.4134841Z * [new branch] gh/yushangdi/9/orig -> origin/gh/yushangdi/9/orig 2025-12-04T10:32:19.4134909Z * [new branch] gh/zklaus/19/base -> 
origin/gh/zklaus/19/base 2025-12-04T10:32:19.4134976Z * [new branch] gh/zklaus/19/head -> origin/gh/zklaus/19/head 2025-12-04T10:32:19.4135042Z * [new branch] gh/zklaus/19/orig -> origin/gh/zklaus/19/orig 2025-12-04T10:32:19.4135108Z * [new branch] gh/zklaus/20/base -> origin/gh/zklaus/20/base 2025-12-04T10:32:19.4135173Z * [new branch] gh/zklaus/20/head -> origin/gh/zklaus/20/head 2025-12-04T10:32:19.4135238Z * [new branch] gh/zklaus/20/orig -> origin/gh/zklaus/20/orig 2025-12-04T10:32:19.4135332Z * [new branch] gh/zklaus/21/base -> origin/gh/zklaus/21/base 2025-12-04T10:32:19.4135397Z * [new branch] gh/zklaus/21/head -> origin/gh/zklaus/21/head 2025-12-04T10:32:19.4135462Z * [new branch] gh/zklaus/21/orig -> origin/gh/zklaus/21/orig 2025-12-04T10:32:19.4135529Z * [new branch] gh/zklaus/22/base -> origin/gh/zklaus/22/base 2025-12-04T10:32:19.4135593Z * [new branch] gh/zklaus/22/head -> origin/gh/zklaus/22/head 2025-12-04T10:32:19.4135659Z * [new branch] gh/zklaus/22/orig -> origin/gh/zklaus/22/orig 2025-12-04T10:32:19.4135723Z * [new branch] gh/zklaus/23/base -> origin/gh/zklaus/23/base 2025-12-04T10:32:19.4135790Z * [new branch] gh/zklaus/23/head -> origin/gh/zklaus/23/head 2025-12-04T10:32:19.4135854Z * [new branch] gh/zklaus/23/orig -> origin/gh/zklaus/23/orig 2025-12-04T10:32:19.4135920Z * [new branch] gh/zklaus/24/base -> origin/gh/zklaus/24/base 2025-12-04T10:32:19.4135985Z * [new branch] gh/zklaus/24/head -> origin/gh/zklaus/24/head 2025-12-04T10:32:19.4136049Z * [new branch] gh/zklaus/24/orig -> origin/gh/zklaus/24/orig 2025-12-04T10:32:19.4136147Z * [new branch] gh/zou3519/1197/base -> origin/gh/zou3519/1197/base 2025-12-04T10:32:19.4136217Z * [new branch] gh/zou3519/1197/head -> origin/gh/zou3519/1197/head 2025-12-04T10:32:19.4136289Z * [new branch] gh/zou3519/1197/orig -> origin/gh/zou3519/1197/orig 2025-12-04T10:32:19.4136358Z * [new branch] gh/zou3519/1199/base -> origin/gh/zou3519/1199/base 2025-12-04T10:32:19.4136426Z * [new branch] gh/zou3519/1199/head -> origin/gh/zou3519/1199/head 2025-12-04T10:32:19.4136494Z * [new branch] gh/zou3519/1199/orig -> origin/gh/zou3519/1199/orig 2025-12-04T10:32:19.4136563Z * [new branch] gh/zou3519/1200/base -> origin/gh/zou3519/1200/base 2025-12-04T10:32:19.4136634Z * [new branch] gh/zou3519/1200/head -> origin/gh/zou3519/1200/head 2025-12-04T10:32:19.4136702Z * [new branch] gh/zou3519/1200/orig -> origin/gh/zou3519/1200/orig 2025-12-04T10:32:19.4136771Z * [new branch] gh/zou3519/1201/base -> origin/gh/zou3519/1201/base 2025-12-04T10:32:19.4136839Z * [new branch] gh/zou3519/1201/head -> origin/gh/zou3519/1201/head 2025-12-04T10:32:19.4136907Z * [new branch] gh/zou3519/1201/orig -> origin/gh/zou3519/1201/orig 2025-12-04T10:32:19.4136975Z * [new branch] gh/zou3519/1202/base -> origin/gh/zou3519/1202/base 2025-12-04T10:32:19.4137046Z * [new branch] gh/zou3519/1202/head -> origin/gh/zou3519/1202/head 2025-12-04T10:32:19.4137112Z * [new branch] gh/zou3519/1202/orig -> origin/gh/zou3519/1202/orig 2025-12-04T10:32:19.4137181Z * [new branch] gh/zpcore/1/base -> origin/gh/zpcore/1/base 2025-12-04T10:32:19.4137250Z * [new branch] gh/zpcore/1/head -> origin/gh/zpcore/1/head 2025-12-04T10:32:19.4137316Z * [new branch] gh/zpcore/11/base -> origin/gh/zpcore/11/base 2025-12-04T10:32:19.4137384Z * [new branch] gh/zpcore/11/head -> origin/gh/zpcore/11/head 2025-12-04T10:32:19.4137451Z * [new branch] gh/zpcore/11/orig -> origin/gh/zpcore/11/orig 2025-12-04T10:32:19.4137516Z * [new branch] gh/zpcore/12/base -> origin/gh/zpcore/12/base 
2025-12-04T10:32:19.4137583Z * [new branch] gh/zpcore/12/head -> origin/gh/zpcore/12/head 2025-12-04T10:32:19.4137647Z * [new branch] gh/zpcore/12/orig -> origin/gh/zpcore/12/orig 2025-12-04T10:32:19.4137712Z * [new branch] gh/zpcore/13/base -> origin/gh/zpcore/13/base 2025-12-04T10:32:19.4137807Z * [new branch] gh/zpcore/13/head -> origin/gh/zpcore/13/head 2025-12-04T10:32:19.4137873Z * [new branch] gh/zpcore/13/orig -> origin/gh/zpcore/13/orig 2025-12-04T10:32:19.4137937Z * [new branch] gh/zpcore/14/base -> origin/gh/zpcore/14/base 2025-12-04T10:32:19.4138003Z * [new branch] gh/zpcore/14/head -> origin/gh/zpcore/14/head 2025-12-04T10:32:19.4138069Z * [new branch] gh/zpcore/14/orig -> origin/gh/zpcore/14/orig 2025-12-04T10:32:19.4138134Z * [new branch] gh/zpcore/15/base -> origin/gh/zpcore/15/base 2025-12-04T10:32:19.4138200Z * [new branch] gh/zpcore/15/head -> origin/gh/zpcore/15/head 2025-12-04T10:32:19.4138266Z * [new branch] gh/zpcore/15/orig -> origin/gh/zpcore/15/orig 2025-12-04T10:32:19.4138331Z * [new branch] gh/zpcore/2/base -> origin/gh/zpcore/2/base 2025-12-04T10:32:19.4138399Z * [new branch] gh/zpcore/2/head -> origin/gh/zpcore/2/head 2025-12-04T10:32:19.4138464Z * [new branch] gh/zpcore/21/base -> origin/gh/zpcore/21/base 2025-12-04T10:32:19.4138530Z * [new branch] gh/zpcore/21/head -> origin/gh/zpcore/21/head 2025-12-04T10:32:19.4138596Z * [new branch] gh/zpcore/21/orig -> origin/gh/zpcore/21/orig 2025-12-04T10:32:19.4138686Z * [new branch] gh/zpcore/22/base -> origin/gh/zpcore/22/base 2025-12-04T10:32:19.4138751Z * [new branch] gh/zpcore/22/head -> origin/gh/zpcore/22/head 2025-12-04T10:32:19.4138818Z * [new branch] gh/zpcore/22/orig -> origin/gh/zpcore/22/orig 2025-12-04T10:32:19.4138883Z * [new branch] gh/zpcore/23/base -> origin/gh/zpcore/23/base 2025-12-04T10:32:19.4138948Z * [new branch] gh/zpcore/23/head -> origin/gh/zpcore/23/head 2025-12-04T10:32:19.4139015Z * [new branch] gh/zpcore/23/orig -> origin/gh/zpcore/23/orig 2025-12-04T10:32:19.4139080Z * [new branch] gh/zpcore/24/base -> origin/gh/zpcore/24/base 2025-12-04T10:32:19.4139145Z * [new branch] gh/zpcore/24/head -> origin/gh/zpcore/24/head 2025-12-04T10:32:19.4139210Z * [new branch] gh/zpcore/24/orig -> origin/gh/zpcore/24/orig 2025-12-04T10:32:19.4139278Z * [new branch] gh/zpcore/25/base -> origin/gh/zpcore/25/base 2025-12-04T10:32:19.4139344Z * [new branch] gh/zpcore/25/head -> origin/gh/zpcore/25/head 2025-12-04T10:32:19.4139409Z * [new branch] gh/zpcore/25/orig -> origin/gh/zpcore/25/orig 2025-12-04T10:32:19.4139473Z * [new branch] gh/zpcore/26/base -> origin/gh/zpcore/26/base 2025-12-04T10:32:19.4139540Z * [new branch] gh/zpcore/26/head -> origin/gh/zpcore/26/head 2025-12-04T10:32:19.4139629Z * [new branch] gh/zpcore/26/orig -> origin/gh/zpcore/26/orig 2025-12-04T10:32:19.4139696Z * [new branch] gh/zpcore/27/base -> origin/gh/zpcore/27/base 2025-12-04T10:32:19.4139763Z * [new branch] gh/zpcore/27/head -> origin/gh/zpcore/27/head 2025-12-04T10:32:19.4139828Z * [new branch] gh/zpcore/27/orig -> origin/gh/zpcore/27/orig 2025-12-04T10:32:19.4139895Z * [new branch] gh/zpcore/28/base -> origin/gh/zpcore/28/base 2025-12-04T10:32:19.4139962Z * [new branch] gh/zpcore/28/head -> origin/gh/zpcore/28/head 2025-12-04T10:32:19.4140027Z * [new branch] gh/zpcore/28/orig -> origin/gh/zpcore/28/orig 2025-12-04T10:32:19.4140094Z * [new branch] gh/zpcore/3/base -> origin/gh/zpcore/3/base 2025-12-04T10:32:19.4140160Z * [new branch] gh/zpcore/3/head -> origin/gh/zpcore/3/head 2025-12-04T10:32:19.4140225Z * [new branch] 
gh/zpcore/4/base -> origin/gh/zpcore/4/base 2025-12-04T10:32:19.4140333Z * [new branch] gh/zpcore/4/head -> origin/gh/zpcore/4/head 2025-12-04T10:32:19.4140400Z * [new branch] gh/zpcore/5/base -> origin/gh/zpcore/5/base 2025-12-04T10:32:19.4140464Z * [new branch] gh/zpcore/5/head -> origin/gh/zpcore/5/head 2025-12-04T10:32:19.4140533Z * [new branch] gh/zpcore/6/base -> origin/gh/zpcore/6/base 2025-12-04T10:32:19.4140597Z * [new branch] gh/zpcore/6/head -> origin/gh/zpcore/6/head 2025-12-04T10:32:19.4140662Z * [new branch] gh/zpcore/7/base -> origin/gh/zpcore/7/base 2025-12-04T10:32:19.4140727Z * [new branch] gh/zpcore/7/head -> origin/gh/zpcore/7/head 2025-12-04T10:32:19.4140792Z * [new branch] gh/zpcore/8/base -> origin/gh/zpcore/8/base 2025-12-04T10:32:19.4140857Z * [new branch] gh/zpcore/8/head -> origin/gh/zpcore/8/head 2025-12-04T10:32:19.4140926Z * [new branch] google-main -> origin/google-main 2025-12-04T10:32:19.4141011Z * [new branch] guangyey/external_stream -> origin/guangyey/external_stream 2025-12-04T10:32:19.4141082Z * [new branch] guangyey/test_2025 -> origin/guangyey/test_2025 2025-12-04T10:32:19.4141264Z * [new branch] guilhermeleobas/cherry-pick-55d87d9dfd9 -> origin/guilhermeleobas/cherry-pick-55d87d9dfd9 2025-12-04T10:32:19.4141381Z * [new branch] hameerabbasi/complex_tensor_subclass -> origin/hameerabbasi/complex_tensor_subclass 2025-12-04T10:32:19.4141518Z * [new branch] hameerabbasi/fix-ctensor-gradcheck-tests -> origin/hameerabbasi/fix-ctensor-gradcheck-tests 2025-12-04T10:32:19.4141626Z * [new branch] hameerabbasi/gradcheck-allclose -> origin/hameerabbasi/gradcheck-allclose 2025-12-04T10:32:19.4141689Z * [new branch] hc_baseline -> origin/hc_baseline 2025-12-04T10:32:19.4141753Z * [new branch] hhh_rand -> origin/hhh_rand 2025-12-04T10:32:19.4141815Z * [new branch] huba/f1 -> origin/huba/f1 2025-12-04T10:32:19.4142005Z * [new branch] increase-timeout-linux-jammy-cuda12_8-py3_10-gcc11-test -> origin/increase-timeout-linux-jammy-cuda12_8-py3_10-gcc11-test 2025-12-04T10:32:19.4142070Z * [new branch] inlining -> origin/inlining 2025-12-04T10:32:19.4142139Z * [new branch] inlining-ezyang -> origin/inlining-ezyang 2025-12-04T10:32:19.4142221Z * [new branch] install-torchao-0.13.0 -> origin/install-torchao-0.13.0 2025-12-04T10:32:19.4142398Z * [new branch] instrument-trunk-pull-linux-with-job-test-filters -> origin/instrument-trunk-pull-linux-with-job-test-filters 2025-12-04T10:32:19.4142468Z * [new branch] invoke-subgraph -> origin/invoke-subgraph 2025-12-04T10:32:19.4142534Z * [new branch] issue#58739 -> origin/issue#58739 2025-12-04T10:32:19.4142614Z * [new branch] jainapurva-patch-1 -> origin/jainapurva-patch-1 2025-12-04T10:32:19.4142674Z * [new branch] jathu/o3 -> origin/jathu/o3 2025-12-04T10:32:19.4142733Z * [new branch] jathu/sve -> origin/jathu/sve 2025-12-04T10:32:19.4142859Z * [new branch] jcaip/test-cusparselt-version-0.6.2 -> origin/jcaip/test-cusparselt-version-0.6.2 2025-12-04T10:32:19.4142962Z * [new branch] jcaip/update-cusparselt-0.6.2 -> origin/jcaip/update-cusparselt-0.6.2 2025-12-04T10:32:19.4143072Z * [new branch] jiannanWang/memorysnapshot_filter -> origin/jiannanWang/memorysnapshot_filter 2025-12-04T10:32:19.4143181Z * [new branch] jiannanWang/profilerstepwarning -> origin/jiannanWang/profilerstepwarning 2025-12-04T10:32:19.4143267Z * [new branch] jithunnair-amd-patch-1 -> origin/jithunnair-amd-patch-1 2025-12-04T10:32:19.4143389Z * [new branch] jithunnair-amd-patch-10 -> origin/jithunnair-amd-patch-10 2025-12-04T10:32:19.4143471Z * [new branch] 
jithunnair-amd-patch-2 -> origin/jithunnair-amd-patch-2 2025-12-04T10:32:19.4143551Z * [new branch] jithunnair-amd-patch-3 -> origin/jithunnair-amd-patch-3 2025-12-04T10:32:19.4143631Z * [new branch] jithunnair-amd-patch-4 -> origin/jithunnair-amd-patch-4 2025-12-04T10:32:19.4143711Z * [new branch] jithunnair-amd-patch-5 -> origin/jithunnair-amd-patch-5 2025-12-04T10:32:19.4143790Z * [new branch] jithunnair-amd-patch-6 -> origin/jithunnair-amd-patch-6 2025-12-04T10:32:19.4143871Z * [new branch] jithunnair-amd-patch-7 -> origin/jithunnair-amd-patch-7 2025-12-04T10:32:19.4143947Z * [new branch] jithunnair-amd-patch-8 -> origin/jithunnair-amd-patch-8 2025-12-04T10:32:19.4144027Z * [new branch] jithunnair-amd-patch-9 -> origin/jithunnair-amd-patch-9 2025-12-04T10:32:19.4144106Z * [new branch] justinchu/native-qdq -> origin/justinchu/native-qdq 2025-12-04T10:32:19.4144177Z * [new branch] kainan666/xlf_debug -> origin/kainan666/xlf_debug 2025-12-04T10:32:19.4144240Z * [new branch] kainan_test -> origin/kainan_test 2025-12-04T10:32:19.4144352Z * [new branch] larryliu0820-patch-1 -> origin/larryliu0820-patch-1 2025-12-04T10:32:19.4144457Z * [new branch] leslie/test_group_gemm_epilogues -> origin/leslie/test_group_gemm_epilogues 2025-12-04T10:32:19.4144559Z * [new branch] lessw2020/fix_cutlass_cache_error -> origin/lessw2020/fix_cutlass_cache_error 2025-12-04T10:32:19.4144639Z * [new branch] liaoxuan/shm_all_reduce -> origin/liaoxuan/shm_all_reduce 2025-12-04T10:32:19.4144738Z * [new branch] liaoxuan/test_fa_disable_softmax -> origin/liaoxuan/test_fa_disable_softmax 2025-12-04T10:32:19.4144817Z * [new branch] liaoxuan/test_int8_sdpa -> origin/liaoxuan/test_int8_sdpa 2025-12-04T10:32:19.4144887Z * [new branch] llama4-stable -> origin/llama4-stable 2025-12-04T10:32:19.4144953Z * [new branch] lts/release/1.8 -> origin/lts/release/1.8 2025-12-04T10:32:19.4145027Z * [new branch] lucaskabela/#94773 -> origin/lucaskabela/#94773 2025-12-04T10:32:19.4145105Z * [new branch] lucaskabela/fix_164876 -> origin/lucaskabela/fix_164876 2025-12-04T10:32:19.4145188Z * [new branch] lucaskabela/flop_counter -> origin/lucaskabela/flop_counter 2025-12-04T10:32:19.4145285Z * [new branch] lucaskabela/func_under_decomp -> origin/lucaskabela/func_under_decomp 2025-12-04T10:32:19.4145390Z * [new branch] lucaskabela/functional_in_dynamo -> origin/lucaskabela/functional_in_dynamo 2025-12-04T10:32:19.4145515Z * [new branch] lucaskabela/install_params_as_graph_attr -> origin/lucaskabela/install_params_as_graph_attr 2025-12-04T10:32:19.4145627Z * [new branch] lucaskabela/parameters_as_graph_attr -> origin/lucaskabela/parameters_as_graph_attr 2025-12-04T10:32:19.4145760Z * [new branch] lucaskabela/remove_aot_dispatcher_metadata -> origin/lucaskabela/remove_aot_dispatcher_metadata 2025-12-04T10:32:19.4145840Z * [new branch] lucaskabela/rnn_decomp -> origin/lucaskabela/rnn_decomp 2025-12-04T10:32:19.4145932Z * [new branch] lucaskabela/typing_backends -> origin/lucaskabela/typing_backends 2025-12-04T10:32:19.4146029Z * [new branch] lucaskabela/typing_ctx_manager -> origin/lucaskabela/typing_ctx_manager 2025-12-04T10:32:19.4146123Z * [new branch] lucaskabela/typing_nn_module -> origin/lucaskabela/typing_nn_module 2025-12-04T10:32:19.4146225Z * [new branch] lucaskabela/typing_user_defined -> origin/lucaskabela/typing_user_defined 2025-12-04T10:32:19.4146345Z * [new branch] lucaskabela/typing_variables -> origin/lucaskabela/typing_variables 2025-12-04T10:32:19.4146453Z * [new branch] lucaskabela/typing_variables_dicts -> 
origin/lucaskabela/typing_variables_dicts 2025-12-04T10:32:19.4146574Z * [new branch] lucaskabela/typing_variables_functions -> origin/lucaskabela/typing_variables_functions 2025-12-04T10:32:19.4146682Z * [new branch] lucaskabela/typing_variables_lists -> origin/lucaskabela/typing_variables_lists 2025-12-04T10:32:19.4146752Z * [new branch] lw/torch_box_by_ref -> origin/lw/torch_box_by_ref 2025-12-04T10:32:19.4146813Z * [new branch] main -> origin/main 2025-12-04T10:32:19.4146882Z * [new branch] malfet-patch-1 -> origin/malfet-patch-1 2025-12-04T10:32:19.4146951Z * [new branch] malfet-patch-2 -> origin/malfet-patch-2 2025-12-04T10:32:19.4147018Z * [new branch] malfet-patch-3 -> origin/malfet-patch-3 2025-12-04T10:32:19.4147082Z * [new branch] malfet-patch-4 -> origin/malfet-patch-4 2025-12-04T10:32:19.4147146Z * [new branch] malfet-patch-5 -> origin/malfet-patch-5 2025-12-04T10:32:19.4147211Z * [new branch] malfet-patch-6 -> origin/malfet-patch-6 2025-12-04T10:32:19.4147315Z * [new branch] malfet-patch-7 -> origin/malfet-patch-7 2025-12-04T10:32:19.4147382Z * [new branch] malfet-patch-8 -> origin/malfet-patch-8 2025-12-04T10:32:19.4147456Z * [new branch] malfet/add-3.14-ci -> origin/malfet/add-3.14-ci 2025-12-04T10:32:19.4147616Z * [new branch] malfet/be-do-not-make-typos-in-build-artifacts -> origin/malfet/be-do-not-make-typos-in-build-artifacts 2025-12-04T10:32:19.4147783Z * [new branch] malfet/be-move-more-settings-to-checkout-pytorch -> origin/malfet/be-move-more-settings-to-checkout-pytorch 2025-12-04T10:32:19.4147911Z * [new branch] malfet/be-remove-misisng-neon-headers -> origin/malfet/be-remove-misisng-neon-headers 2025-12-04T10:32:19.4148009Z * [new branch] malfet/mps-implement-col2im -> origin/malfet/mps-implement-col2im 2025-12-04T10:32:19.4148128Z * [new branch] manuel/aoti_metal_shimify-thread_safe -> origin/manuel/aoti_metal_shimify-thread_safe 2025-12-04T10:32:19.4148218Z * [new branch] manuel/inductor_link_openmp -> origin/manuel/inductor_link_openmp 2025-12-04T10:32:19.4148292Z * [new branch] masnesral/metaconda -> origin/masnesral/metaconda 2025-12-04T10:32:19.4148369Z * [new branch] mem_profiler_flaky_fix -> origin/mem_profiler_flaky_fix 2025-12-04T10:32:19.4148448Z * [new branch] mem_profiler_stack_trace -> origin/mem_profiler_stack_trace 2025-12-04T10:32:19.4148524Z * [new branch] memory_profiler_stack -> origin/memory_profiler_stack 2025-12-04T10:32:19.4148599Z * [new branch] metascroy-patch-1 -> origin/metascroy-patch-1 2025-12-04T10:32:19.4148662Z * [new branch] mingw_posix -> origin/mingw_posix 2025-12-04T10:32:19.4148736Z * [new branch] mlazos/S429861-debug -> origin/mlazos/S429861-debug 2025-12-04T10:32:19.4148797Z * [new branch] mlazos/aa -> origin/mlazos/aa 2025-12-04T10:32:19.4148859Z * [new branch] mlazos/acts -> origin/mlazos/acts 2025-12-04T10:32:19.4148932Z * [new branch] mlazos/arg-renames -> origin/mlazos/arg-renames 2025-12-04T10:32:19.4149009Z * [new branch] mlazos/bad-cudagraphs -> origin/mlazos/bad-cudagraphs 2025-12-04T10:32:19.4149108Z * [new branch] mlazos/baseline-graph-breaks -> origin/mlazos/baseline-graph-breaks 2025-12-04T10:32:19.4149181Z * [new branch] mlazos/beta-tensor -> origin/mlazos/beta-tensor 2025-12-04T10:32:19.4149271Z * [new branch] mlazos/buffers -> origin/mlazos/buffers 2025-12-04T10:32:19.4149337Z * [new branch] mlazos/buffers2 -> origin/mlazos/buffers2 2025-12-04T10:32:19.4149403Z * [new branch] mlazos/buffers3 -> origin/mlazos/buffers3 2025-12-04T10:32:19.4149465Z * [new branch] mlazos/bwd -> origin/mlazos/bwd 
2025-12-04T10:32:19.4149534Z * [new branch] mlazos/combo-test -> origin/mlazos/combo-test 2025-12-04T10:32:19.4149644Z * [new branch] mlazos/ctx-cleanup -> origin/mlazos/ctx-cleanup 2025-12-04T10:32:19.4149719Z * [new branch] mlazos/cuda-cmd-log -> origin/mlazos/cuda-cmd-log 2025-12-04T10:32:19.4149798Z * [new branch] mlazos/cudagraph-tests -> origin/mlazos/cudagraph-tests 2025-12-04T10:32:19.4149900Z * [new branch] mlazos/cudagraphs-measurement -> origin/mlazos/cudagraphs-measurement 2025-12-04T10:32:19.4149975Z * [new branch] mlazos/cutlass-test -> origin/mlazos/cutlass-test 2025-12-04T10:32:19.4150055Z * [new branch] mlazos/cutlass-topo-bug -> origin/mlazos/cutlass-topo-bug 2025-12-04T10:32:19.4150137Z * [new branch] mlazos/dataclass-proxy -> origin/mlazos/dataclass-proxy 2025-12-04T10:32:19.4150247Z * [new branch] mlazos/dc-attrs -> origin/mlazos/dc-attrs 2025-12-04T10:32:19.4150319Z * [new branch] mlazos/dc-helion -> origin/mlazos/dc-helion 2025-12-04T10:32:19.4150387Z * [new branch] mlazos/dict-fix -> origin/mlazos/dict-fix 2025-12-04T10:32:19.4150456Z * [new branch] mlazos/disable-tf -> origin/mlazos/disable-tf 2025-12-04T10:32:19.4150524Z * [new branch] mlazos/dupe-fix -> origin/mlazos/dupe-fix 2025-12-04T10:32:19.4150592Z * [new branch] mlazos/dyn-batch -> origin/mlazos/dyn-batch 2025-12-04T10:32:19.4150656Z * [new branch] mlazos/evt -> origin/mlazos/evt 2025-12-04T10:32:19.4150737Z * [new branch] mlazos/extract-examples -> origin/mlazos/extract-examples 2025-12-04T10:32:19.4150807Z * [new branch] mlazos/foreach-op -> origin/mlazos/foreach-op 2025-12-04T10:32:19.4150869Z * [new branch] mlazos/fp8 -> origin/mlazos/fp8 2025-12-04T10:32:19.4150936Z * [new branch] mlazos/fp8-bias -> origin/mlazos/fp8-bias 2025-12-04T10:32:19.4151014Z * [new branch] mlazos/fp8-bias-fusion -> origin/mlazos/fp8-bias-fusion 2025-12-04T10:32:19.4151081Z * [new branch] mlazos/fp8-fixes -> origin/mlazos/fp8-fixes 2025-12-04T10:32:19.4151148Z * [new branch] mlazos/freezing -> origin/mlazos/freezing 2025-12-04T10:32:19.4151214Z * [new branch] mlazos/h-comp -> origin/mlazos/h-comp 2025-12-04T10:32:19.4151281Z * [new branch] mlazos/h-comp2 -> origin/mlazos/h-comp2 2025-12-04T10:32:19.4151349Z * [new branch] mlazos/hash-hop -> origin/mlazos/hash-hop 2025-12-04T10:32:19.4151410Z * [new branch] mlazos/hc -> origin/mlazos/hc 2025-12-04T10:32:19.4151476Z * [new branch] mlazos/hc-cycles -> origin/mlazos/hc-cycles 2025-12-04T10:32:19.4151546Z * [new branch] mlazos/hc-fixes -> origin/mlazos/hc-fixes 2025-12-04T10:32:19.4151614Z * [new branch] mlazos/hc-fixes3 -> origin/mlazos/hc-fixes3 2025-12-04T10:32:19.4151681Z * [new branch] mlazos/hc-fixes4 -> origin/mlazos/hc-fixes4 2025-12-04T10:32:19.4151745Z * [new branch] mlazos/hc-hf -> origin/mlazos/hc-hf 2025-12-04T10:32:19.4151809Z * [new branch] mlazos/hc-mut -> origin/mlazos/hc-mut 2025-12-04T10:32:19.4151909Z * [new branch] mlazos/hc10 -> origin/mlazos/hc10 2025-12-04T10:32:19.4151973Z * [new branch] mlazos/hc11 -> origin/mlazos/hc11 2025-12-04T10:32:19.4152032Z * [new branch] mlazos/hc12 -> origin/mlazos/hc12 2025-12-04T10:32:19.4152093Z * [new branch] mlazos/hc13 -> origin/mlazos/hc13 2025-12-04T10:32:19.4152155Z * [new branch] mlazos/hc14 -> origin/mlazos/hc14 2025-12-04T10:32:19.4152215Z * [new branch] mlazos/hc15 -> origin/mlazos/hc15 2025-12-04T10:32:19.4152275Z * [new branch] mlazos/hc2 -> origin/mlazos/hc2 2025-12-04T10:32:19.4152336Z * [new branch] mlazos/hc4 -> origin/mlazos/hc4 2025-12-04T10:32:19.4152395Z * [new branch] mlazos/hc5 -> origin/mlazos/hc5 
2025-12-04T10:32:19.4152456Z * [new branch] mlazos/hc6 -> origin/mlazos/hc6 2025-12-04T10:32:19.4152516Z * [new branch] mlazos/hc7 -> origin/mlazos/hc7 2025-12-04T10:32:19.4152573Z * [new branch] mlazos/hc8 -> origin/mlazos/hc8 2025-12-04T10:32:19.4152633Z * [new branch] mlazos/hc9 -> origin/mlazos/hc9 2025-12-04T10:32:19.4152704Z * [new branch] mlazos/hc_baseline2 -> origin/mlazos/hc_baseline2 2025-12-04T10:32:19.4152827Z * [new branch] mlazos/inductor-streams -> origin/mlazos/inductor-streams 2025-12-04T10:32:19.4152890Z * [new branch] mlazos/main -> origin/mlazos/main 2025-12-04T10:32:19.4152950Z * [new branch] mlazos/mcg2 -> origin/mlazos/mcg2 2025-12-04T10:32:19.4153023Z * [new branch] mlazos/meta-guards -> origin/mlazos/meta-guards 2025-12-04T10:32:19.4153129Z * [new branch] mlazos/mlazos/foreach-map-adam -> origin/mlazos/mlazos/foreach-map-adam 2025-12-04T10:32:19.4153226Z * [new branch] mlazos/mlazos/tf-mode-backup -> origin/mlazos/mlazos/tf-mode-backup 2025-12-04T10:32:19.4153292Z * [new branch] mlazos/mod-fix -> origin/mlazos/mod-fix 2025-12-04T10:32:19.4153361Z * [new branch] mlazos/mode-fix -> origin/mlazos/mode-fix 2025-12-04T10:32:19.4153425Z * [new branch] mlazos/offsets -> origin/mlazos/offsets 2025-12-04T10:32:19.4153499Z * [new branch] mlazos/overguarding -> origin/mlazos/overguarding 2025-12-04T10:32:19.4153575Z * [new branch] mlazos/proxy-ctors -> origin/mlazos/proxy-ctors 2025-12-04T10:32:19.4153643Z * [new branch] mlazos/quant-fix -> origin/mlazos/quant-fix 2025-12-04T10:32:19.4153716Z * [new branch] mlazos/resnet-fix -> origin/mlazos/resnet-fix 2025-12-04T10:32:19.4153789Z * [new branch] mlazos/rm-buf-names -> origin/mlazos/rm-buf-names 2025-12-04T10:32:19.4153855Z * [new branch] mlazos/rm-code -> origin/mlazos/rm-code 2025-12-04T10:32:19.4153919Z * [new branch] mlazos/rm-spam -> origin/mlazos/rm-spam 2025-12-04T10:32:19.4153983Z * [new branch] mlazos/rtp -> origin/mlazos/rtp 2025-12-04T10:32:19.4154065Z * [new branch] mlazos/static-idx-dbg -> origin/mlazos/static-idx-dbg 2025-12-04T10:32:19.4154153Z * [new branch] mlazos/static-inputs-log -> origin/mlazos/static-inputs-log 2025-12-04T10:32:19.4154216Z * [new branch] mlazos/stests -> origin/mlazos/stests 2025-12-04T10:32:19.4154284Z * [new branch] mlazos/stream-ops -> origin/mlazos/stream-ops 2025-12-04T10:32:19.4154350Z * [new branch] mlazos/td-fix2 -> origin/mlazos/td-fix2 2025-12-04T10:32:19.4154427Z * [new branch] mlazos/tensor-hasattr2 -> origin/mlazos/tensor-hasattr2 2025-12-04T10:32:19.4154488Z * [new branch] mlazos/test -> origin/mlazos/test 2025-12-04T10:32:19.4154582Z * [new branch] mlazos/tf-mode -> origin/mlazos/tf-mode 2025-12-04T10:32:19.4154660Z * [new branch] mlazos/tf-mode-backup2 -> origin/mlazos/tf-mode-backup2 2025-12-04T10:32:19.4154735Z * [new branch] mlazos/tf-mode-reland -> origin/mlazos/tf-mode-reland 2025-12-04T10:32:19.4154814Z * [new branch] mlazos/tf-mode-reland2 -> origin/mlazos/tf-mode-reland2 2025-12-04T10:32:19.4154889Z * [new branch] mlazos/tf-mode-reland3 -> origin/mlazos/tf-mode-reland3 2025-12-04T10:32:19.4154965Z * [new branch] mlazos/triton-no-epi -> origin/mlazos/triton-no-epi 2025-12-04T10:32:19.4155039Z * [new branch] mlazos/tune-proto -> origin/mlazos/tune-proto 2025-12-04T10:32:19.4155111Z * [new branch] mlazos/tuple-fixes -> origin/mlazos/tuple-fixes 2025-12-04T10:32:19.4155183Z * [new branch] mlazos/tuple-fixes2 -> origin/mlazos/tuple-fixes2 2025-12-04T10:32:19.4155262Z * [new branch] mlazos/tuple-handling -> origin/mlazos/tuple-handling 2025-12-04T10:32:19.4155343Z * 
[new branch] mlazos/user-stream-base -> origin/mlazos/user-stream-base 2025-12-04T10:32:19.4155415Z * [new branch] mlazos/user-streams -> origin/mlazos/user-streams 2025-12-04T10:32:19.4155538Z * [new branch] mlazos/user-streams-backup -> origin/mlazos/user-streams-backup 2025-12-04T10:32:19.4155633Z * [new branch] mlazos/user-streams-backup2 -> origin/mlazos/user-streams-backup2 2025-12-04T10:32:19.4155704Z * [new branch] mlazos/vary-beta -> origin/mlazos/vary-beta 2025-12-04T10:32:19.4155773Z * [new branch] mlazos/vary-beta2 -> origin/mlazos/vary-beta2 2025-12-04T10:32:19.4155846Z * [new branch] mlazos/weird-perf1 -> origin/mlazos/weird-perf1 2025-12-04T10:32:19.4155918Z * [new branch] mm_out_dtype_compile -> origin/mm_out_dtype_compile 2025-12-04T10:32:19.4155982Z * [new branch] module-shim -> origin/module-shim 2025-12-04T10:32:19.4156043Z * [new branch] move_config -> origin/move_config 2025-12-04T10:32:19.4156113Z * [new branch] msaroufim/reduce -> origin/msaroufim/reduce 2025-12-04T10:32:19.4156184Z * [new branch] mtia/basic-cmake -> origin/mtia/basic-cmake 2025-12-04T10:32:19.4156285Z * [new branch] mwizak/fix-triton-block-shape -> origin/mwizak/fix-triton-block-shape 2025-12-04T10:32:19.4156353Z * [new branch] my_varlen_backup -> origin/my_varlen_backup 2025-12-04T10:32:19.4156427Z * [new branch] nativert_num_outputs -> origin/nativert_num_outputs 2025-12-04T10:32:19.4156489Z * [new branch] new-codegen -> origin/new-codegen 2025-12-04T10:32:19.4156557Z * [new branch] newtest-base -> origin/newtest-base 2025-12-04T10:32:19.4156628Z * [new branch] ngimel/addmm_dtype -> origin/ngimel/addmm_dtype 2025-12-04T10:32:19.4156692Z * [new branch] ngimel/div_inv -> origin/ngimel/div_inv 2025-12-04T10:32:19.4156770Z * [new branch] ngimel/error_index_list -> origin/ngimel/error_index_list 2025-12-04T10:32:19.4156841Z * [new branch] ngimel/gather_grid -> origin/ngimel/gather_grid 2025-12-04T10:32:19.4156928Z * [new branch] ngimel/gather_grid_release -> origin/ngimel/gather_grid_release 2025-12-04T10:32:19.4156992Z * [new branch] ngimel/gg_new -> origin/ngimel/gg_new 2025-12-04T10:32:19.4157059Z * [new branch] ngimel/hostalloc -> origin/ngimel/hostalloc 2025-12-04T10:32:19.4157128Z * [new branch] ngimel/storage_id -> origin/ngimel/storage_id 2025-12-04T10:32:19.4157193Z * [new branch] nightly -> origin/nightly 2025-12-04T10:32:19.4157341Z * [new branch] nikitaved/addmm_1_rowcol_lt_path_check -> origin/nikitaved/addmm_1_rowcol_lt_path_check 2025-12-04T10:32:19.4157463Z * [new branch] nikitaved/addmm_epilogue_fusions_2d_bias -> origin/nikitaved/addmm_epilogue_fusions_2d_bias 2025-12-04T10:32:19.4157589Z * [new branch] nikitaved/addmm_epilogue_fusions_inductor -> origin/nikitaved/addmm_epilogue_fusions_inductor 2025-12-04T10:32:19.4157710Z * [new branch] nikitaved/addmm_epilogue_fusions_scratch -> origin/nikitaved/addmm_epilogue_fusions_scratch 2025-12-04T10:32:19.4157826Z * [new branch] nikitaved/grad_addmm_epilogue_fusions -> origin/nikitaved/grad_addmm_epilogue_fusions 2025-12-04T10:32:19.4157935Z * [new branch] nikitaved/simpler_can_use_32bit_index -> origin/nikitaved/simpler_can_use_32bit_index 2025-12-04T10:32:19.4158002Z * [new branch] nikitaved/test -> origin/nikitaved/test 2025-12-04T10:32:19.4158129Z * [new branch] nmacchioni-perf-test-async-autotune -> origin/nmacchioni-perf-test-async-autotune 2025-12-04T10:32:19.4158205Z * [new branch] no_distributed_log_spew -> origin/no_distributed_log_spew 2025-12-04T10:32:19.4158268Z * [new branch] nofun-hack -> origin/nofun-hack 
2025-12-04T10:32:19.4158359Z * [new branch] norm_bench -> origin/norm_bench 2025-12-04T10:32:19.4158434Z * [new branch] nullplay/fuse_matmul -> origin/nullplay/fuse_matmul 2025-12-04T10:32:19.4158507Z * [new branch] nullplay_fuse_matmul -> origin/nullplay_fuse_matmul 2025-12-04T10:32:19.4158575Z * [new branch] optimizer_test -> origin/optimizer_test 2025-12-04T10:32:19.4158643Z * [new branch] orig/release/1.10 -> origin/orig/release/1.10 2025-12-04T10:32:19.4158710Z * [new branch] orig/release/1.11 -> origin/orig/release/1.11 2025-12-04T10:32:19.4158779Z * [new branch] orig/release/1.12 -> origin/orig/release/1.12 2025-12-04T10:32:19.4158845Z * [new branch] orig/release/1.13 -> origin/orig/release/1.13 2025-12-04T10:32:19.4158913Z * [new branch] orig/release/1.6 -> origin/orig/release/1.6 2025-12-04T10:32:19.4158979Z * [new branch] orig/release/1.7 -> origin/orig/release/1.7 2025-12-04T10:32:19.4159043Z * [new branch] orig/release/1.8 -> origin/orig/release/1.8 2025-12-04T10:32:19.4159110Z * [new branch] orig/release/1.9 -> origin/orig/release/1.9 2025-12-04T10:32:19.4159174Z * [new branch] orig/release/2.0 -> origin/orig/release/2.0 2025-12-04T10:32:19.4159240Z * [new branch] orig/release/2.1 -> origin/orig/release/2.1 2025-12-04T10:32:19.4159308Z * [new branch] orig/release/2.2 -> origin/orig/release/2.2 2025-12-04T10:32:19.4159373Z * [new branch] orig/release/2.3 -> origin/orig/release/2.3 2025-12-04T10:32:19.4159438Z * [new branch] orig/release/2.4 -> origin/orig/release/2.4 2025-12-04T10:32:19.4159505Z * [new branch] orig/release/2.5 -> origin/orig/release/2.5 2025-12-04T10:32:19.4159614Z * [new branch] orig/release/2.6 -> origin/orig/release/2.6 2025-12-04T10:32:19.4159679Z * [new branch] orig/release/2.7 -> origin/orig/release/2.7 2025-12-04T10:32:19.4159744Z * [new branch] orig/release/2.8 -> origin/orig/release/2.8 2025-12-04T10:32:19.4159808Z * [new branch] orig/release/2.9 -> origin/orig/release/2.9 2025-12-04T10:32:19.4159893Z * [new branch] origin/gh/fxdawnn/1/base -> origin/origin/gh/fxdawnn/1/base 2025-12-04T10:32:19.4159975Z * [new branch] origin/gh/fxdawnn/1/orig -> origin/origin/gh/fxdawnn/1/orig 2025-12-04T10:32:19.4160103Z * [new branch] origin/gh/zpcore/14/orig -> origin/origin/gh/zpcore/14/orig 2025-12-04T10:32:19.4160170Z * [new branch] oulgen-patch-1 -> origin/oulgen-patch-1 2025-12-04T10:32:19.4160238Z * [new branch] oulgen-patch-2 -> origin/oulgen-patch-2 2025-12-04T10:32:19.4160305Z * [new branch] oulgen-patch-3 -> origin/oulgen-patch-3 2025-12-04T10:32:19.4160371Z * [new branch] oulgen-patch-4 -> origin/oulgen-patch-4 2025-12-04T10:32:19.4160439Z * [new branch] padded-tensor -> origin/padded-tensor 2025-12-04T10:32:19.4160501Z * [new branch] pca2 -> origin/pca2 2025-12-04T10:32:19.4160573Z * [new branch] per_channel_backup -> origin/per_channel_backup 2025-12-04T10:32:19.4160636Z * [new branch] perf_ops -> origin/perf_ops 2025-12-04T10:32:19.4160701Z * [new branch] perf_ops_2_9 -> origin/perf_ops_2_9 2025-12-04T10:32:19.4160772Z * [new branch] pianpwk-patch-1 -> origin/pianpwk-patch-1 2025-12-04T10:32:19.4160857Z * [new branch] pianpwk/__draft_debug_mode -> origin/pianpwk/__draft_debug_mode 2025-12-04T10:32:19.4161009Z * [new branch] pianpwk/_debug_mode_for_triton_draft -> origin/pianpwk/_debug_mode_for_triton_draft 2025-12-04T10:32:19.4161111Z * [new branch] pianpwk/_debug_nn_module_compile -> origin/pianpwk/_debug_nn_module_compile 2025-12-04T10:32:19.4161195Z * [new branch] pianpwk/_draft_triton_11_3 -> origin/pianpwk/_draft_triton_11_3 
2025-12-04T10:32:19.4161285Z * [new branch] pianpwk/_manual_bucket_draft -> origin/pianpwk/_manual_bucket_draft 2025-12-04T10:32:19.4161388Z * [new branch] pianpwk/_profile_w_dispatch_keys -> origin/pianpwk/_profile_w_dispatch_keys 2025-12-04T10:32:19.4161486Z * [new branch] pianpwk/_super_draft_debug_mode -> origin/pianpwk/_super_draft_debug_mode 2025-12-04T10:32:19.4161589Z * [new branch] pianpwk/_unbacked_local_shard_size -> origin/pianpwk/_unbacked_local_shard_size 2025-12-04T10:32:19.4161663Z * [new branch] pianpwk/anomaly_tb -> origin/pianpwk/anomaly_tb 2025-12-04T10:32:19.4161745Z * [new branch] pianpwk/auto_fx_annotate -> origin/pianpwk/auto_fx_annotate 2025-12-04T10:32:19.4161857Z * [new branch] pianpwk/backed_size_oblivious_export -> origin/pianpwk/backed_size_oblivious_export 2025-12-04T10:32:19.4161942Z * [new branch] pianpwk/bert_dynamic_perf -> origin/pianpwk/bert_dynamic_perf 2025-12-04T10:32:19.4162038Z * [new branch] pianpwk/debug_fwd_stack_traces -> origin/pianpwk/debug_fwd_stack_traces 2025-12-04T10:32:19.4162123Z * [new branch] pianpwk/debug_hash_tensor -> origin/pianpwk/debug_hash_tensor 2025-12-04T10:32:19.4162214Z * [new branch] pianpwk/debug_mode_annotate -> origin/pianpwk/debug_mode_annotate 2025-12-04T10:32:19.4162301Z * [new branch] pianpwk/debug_mode_defaults -> origin/pianpwk/debug_mode_defaults 2025-12-04T10:32:19.4162381Z * [new branch] pianpwk/debug_mode_hacks -> origin/pianpwk/debug_mode_hacks 2025-12-04T10:32:19.4162487Z * [new branch] pianpwk/debug_mode_opcall_refactor -> origin/pianpwk/debug_mode_opcall_refactor 2025-12-04T10:32:19.4162572Z * [new branch] pianpwk/debug_mode_show_ids -> origin/pianpwk/debug_mode_show_ids 2025-12-04T10:32:19.4162655Z * [new branch] pianpwk/debug_mode_triton -> origin/pianpwk/debug_mode_triton 2025-12-04T10:32:19.4162748Z * [new branch] pianpwk/debug_show_stack_trace -> origin/pianpwk/debug_show_stack_trace 2025-12-04T10:32:19.4162846Z * [new branch] pianpwk/debug_wait_on_collective -> origin/pianpwk/debug_wait_on_collective 2025-12-04T10:32:19.4162971Z * [new branch] pianpwk/debugmode_compile_tf -> origin/pianpwk/debugmode_compile_tf 2025-12-04T10:32:19.4163096Z * [new branch] pianpwk/dispatch_key_debugging_for_debug -> origin/pianpwk/dispatch_key_debugging_for_debug 2025-12-04T10:32:19.4163200Z * [new branch] pianpwk/draft_debug_mode_tfcompile -> origin/pianpwk/draft_debug_mode_tfcompile 2025-12-04T10:32:19.4163297Z * [new branch] pianpwk/draft_multikernel_nn -> origin/pianpwk/draft_multikernel_nn 2025-12-04T10:32:19.4163410Z * [new branch] pianpwk/draft_multikernel_status_10_5 -> origin/pianpwk/draft_multikernel_status_10_5 2025-12-04T10:32:19.4163502Z * [new branch] pianpwk/dtensor_custom_chunk -> origin/pianpwk/dtensor_custom_chunk 2025-12-04T10:32:19.4163605Z * [new branch] pianpwk/dtensor_unbacked_keypath -> origin/pianpwk/dtensor_unbacked_keypath 2025-12-04T10:32:19.4163686Z * [new branch] pianpwk/event_list_tree -> origin/pianpwk/event_list_tree 2025-12-04T10:32:19.4163771Z * [new branch] pianpwk/false_numel_refs -> origin/pianpwk/false_numel_refs 2025-12-04T10:32:19.4163847Z * [new branch] pianpwk/maybe_guard_rel -> origin/pianpwk/maybe_guard_rel 2025-12-04T10:32:19.4163948Z * [new branch] pianpwk/multikernel_hints_draft -> origin/pianpwk/multikernel_hints_draft 2025-12-04T10:32:19.4164098Z * [new branch] pianpwk/no_size_oblivious_slice_scat -> origin/pianpwk/no_size_oblivious_slice_scat 2025-12-04T10:32:19.4164211Z * [new branch] pianpwk/oblivious_reshape_view_better -> 
origin/pianpwk/oblivious_reshape_view_better 2025-12-04T10:32:19.4164292Z * [new branch] pianpwk/pre_forward_hook -> origin/pianpwk/pre_forward_hook 2025-12-04T10:32:19.4164399Z * [new branch] pianpwk/skip_python_keys_alternate -> origin/pianpwk/skip_python_keys_alternate 2025-12-04T10:32:19.4164502Z * [new branch] pianpwk/skip_python_keys_in_guards -> origin/pianpwk/skip_python_keys_in_guards 2025-12-04T10:32:19.4164585Z * [new branch] pianpwk/sym_tokens_draft -> origin/pianpwk/sym_tokens_draft 2025-12-04T10:32:19.4164665Z * [new branch] pianpwk/symint_one_hot -> origin/pianpwk/symint_one_hot 2025-12-04T10:32:19.4164777Z * [new branch] pianpwk/test_pointwise_guard_or_false -> origin/pianpwk/test_pointwise_guard_or_false 2025-12-04T10:32:19.4164875Z * [new branch] pianpwk/totally_draft_sym_wrap -> origin/pianpwk/totally_draft_sym_wrap 2025-12-04T10:32:19.4164961Z * [new branch] pianpwk/try_dumb_stuff -> origin/pianpwk/try_dumb_stuff 2025-12-04T10:32:19.4165040Z * [new branch] pianpwk/try_dumb_stuff_2 -> origin/pianpwk/try_dumb_stuff_2 2025-12-04T10:32:19.4165133Z * [new branch] pianpwk/unbacked_dtensor_mm -> origin/pianpwk/unbacked_dtensor_mm 2025-12-04T10:32:19.4165227Z * [new branch] pianpwk/unbacked_tracing_12_2 -> origin/pianpwk/unbacked_tracing_12_2 2025-12-04T10:32:19.4165305Z * [new branch] pianpwk/user_symints -> origin/pianpwk/user_symints 2025-12-04T10:32:19.4165385Z * [new branch] pianpwk/wan21_reshape -> origin/pianpwk/wan21_reshape 2025-12-04T10:32:19.4165477Z * [new branch] piz/fix_partial_backward_1112 -> origin/piz/fix_partial_backward_1112 2025-12-04T10:32:19.4165551Z * [new branch] piz/prop_cache_clean -> origin/piz/prop_cache_clean 2025-12-04T10:32:19.4165622Z * [new branch] pool-separate -> origin/pool-separate 2025-12-04T10:32:19.4165682Z * [new branch] pr-156087 -> origin/pr-156087 2025-12-04T10:32:19.4165743Z * [new branch] pr/131860 -> origin/pr/131860 2025-12-04T10:32:19.4165811Z * [new branch] predispatch_to -> origin/predispatch_to 2025-12-04T10:32:19.4165875Z * [new branch] protect-c17 -> origin/protect-c17 2025-12-04T10:32:19.4165969Z * [new branch] pt-opt-cuda3 -> origin/pt-opt-cuda3 2025-12-04T10:32:19.4166052Z * [new branch] python_compiled_autograd -> origin/python_compiled_autograd 2025-12-04T10:32:19.4166181Z * [new branch] q1l1/fix_device_moved_constant_type_unknown -> origin/q1l1/fix_device_moved_constant_type_unknown 2025-12-04T10:32:19.4166320Z * [new branch] q1l1/fix_wrong_default_type_for_kernel_call_args -> origin/q1l1/fix_wrong_default_type_for_kernel_call_args 2025-12-04T10:32:19.4166400Z * [new branch] qchip/export-D54134695 -> origin/qchip/export-D54134695 2025-12-04T10:32:19.4166472Z * [new branch] quote-pytest_cache -> origin/quote-pytest_cache 2025-12-04T10:32:19.4166569Z * [new branch] reland-accgrad-stream-warn -> origin/reland-accgrad-stream-warn 2025-12-04T10:32:19.4166633Z * [new branch] release/1.10 -> origin/release/1.10 2025-12-04T10:32:19.4166698Z * [new branch] release/1.11 -> origin/release/1.11 2025-12-04T10:32:19.4166760Z * [new branch] release/1.12 -> origin/release/1.12 2025-12-04T10:32:19.4166822Z * [new branch] release/1.13 -> origin/release/1.13 2025-12-04T10:32:19.4166882Z * [new branch] release/1.4 -> origin/release/1.4 2025-12-04T10:32:19.4166973Z * [new branch] release/1.4.1 -> origin/release/1.4.1 2025-12-04T10:32:19.4167034Z * [new branch] release/1.5 -> origin/release/1.5 2025-12-04T10:32:19.4167094Z * [new branch] release/1.6 -> origin/release/1.6 2025-12-04T10:32:19.4167153Z * [new branch] release/1.7 -> 
origin/release/1.7 2025-12-04T10:32:19.4167212Z * [new branch] release/1.8 -> origin/release/1.8 2025-12-04T10:32:19.4167271Z * [new branch] release/1.9 -> origin/release/1.9 2025-12-04T10:32:19.4167334Z * [new branch] release/2.0 -> origin/release/2.0 2025-12-04T10:32:19.4167393Z * [new branch] release/2.1 -> origin/release/2.1 2025-12-04T10:32:19.4167451Z * [new branch] release/2.2 -> origin/release/2.2 2025-12-04T10:32:19.4167512Z * [new branch] release/2.3 -> origin/release/2.3 2025-12-04T10:32:19.4167572Z * [new branch] release/2.4 -> origin/release/2.4 2025-12-04T10:32:19.4167632Z * [new branch] release/2.5 -> origin/release/2.5 2025-12-04T10:32:19.4167692Z * [new branch] release/2.6 -> origin/release/2.6 2025-12-04T10:32:19.4167751Z * [new branch] release/2.7 -> origin/release/2.7 2025-12-04T10:32:19.4167811Z * [new branch] release/2.8 -> origin/release/2.8 2025-12-04T10:32:19.4167873Z * [new branch] release/2.9 -> origin/release/2.9 2025-12-04T10:32:19.4167935Z * [new branch] release_notes -> origin/release_notes 2025-12-04T10:32:19.4168009Z * [new branch] remove_pyinterpreter -> origin/remove_pyinterpreter 2025-12-04T10:32:19.4168133Z * [new branch] replace-pytorch-labs-20250812-195836 -> origin/replace-pytorch-labs-20250812-195836 2025-12-04T10:32:19.4168255Z * [new branch] replace-pytorch-labs-20250812-200248 -> origin/replace-pytorch-labs-20250812-200248 2025-12-04T10:32:19.4168373Z * [new branch] replace-pytorch-labs-20250812-200324 -> origin/replace-pytorch-labs-20250812-200324 2025-12-04T10:32:19.4168489Z * [new branch] replace-pytorch-labs-20250812-204020 -> origin/replace-pytorch-labs-20250812-204020 2025-12-04T10:32:19.4168617Z * [new branch] revert-131069-gh/krzysztofjordan/1/head -> origin/revert-131069-gh/krzysztofjordan/1/head 2025-12-04T10:32:19.4168758Z * [new branch] revert-131469-gh/andrewor14/51/head -> origin/revert-131469-gh/andrewor14/51/head 2025-12-04T10:32:19.4168859Z * [new branch] revert-152361-gh/fadara01/1/head -> origin/revert-152361-gh/fadara01/1/head 2025-12-04T10:32:19.4168960Z * [new branch] revert-156870-gh/skarjala/3/head -> origin/revert-156870-gh/skarjala/3/head 2025-12-04T10:32:19.4169132Z * [new branch] revert-157914-cherry-pick-157503-by-pytorch_bot_bot_ -> origin/revert-157914-cherry-pick-157503-by-pytorch_bot_bot_ 2025-12-04T10:32:19.4169227Z * [new branch] revert-hoo-invoke-subgraph -> origin/revert-hoo-invoke-subgraph 2025-12-04T10:32:19.4169323Z * [new branch] revert_always_build_distributed -> origin/revert_always_build_distributed 2025-12-04T10:32:19.4169390Z * [new branch] rms_norm_patch -> origin/rms_norm_patch 2025-12-04T10:32:19.4169484Z * [new branch] ruisi/fix_all_to_all_estimation -> origin/ruisi/fix_all_to_all_estimation 2025-12-04T10:32:19.4169608Z * [new branch] ruisi/fix_comm_estimation -> origin/ruisi/fix_comm_estimation 2025-12-04T10:32:19.4169716Z * [new branch] ruisi/fix_dynamic_shape_estimation -> origin/ruisi/fix_dynamic_shape_estimation 2025-12-04T10:32:19.4169866Z * [new branch] ruisi/fix_llama3_autobucketing -> origin/ruisi/fix_llama3_autobucketing 2025-12-04T10:32:19.4169970Z * [new branch] ruisi/fix_manual_bucketing_ep_pass -> origin/ruisi/fix_manual_bucketing_ep_pass 2025-12-04T10:32:19.4170052Z * [new branch] ruisi/manual_bucket_pass -> origin/ruisi/manual_bucket_pass 2025-12-04T10:32:19.4170196Z * [new branch] ryanguo99/cleanup-dynamo-expected-failures -> origin/ryanguo99/cleanup-dynamo-expected-failures 2025-12-04T10:32:19.4170283Z * [new branch] ryanguo99/fix-closure-var -> origin/ryanguo99/fix-closure-var 
2025-12-04T10:32:19.4170360Z * [new branch] rzou/faketensor_bench -> origin/rzou/faketensor_bench 2025-12-04T10:32:19.4170422Z * [new branch] rzou/njt -> origin/rzou/njt 2025-12-04T10:32:19.4170484Z * [new branch] rzou/pca -> origin/rzou/pca 2025-12-04T10:32:19.4170548Z * [new branch] rzou/realprop -> origin/rzou/realprop 2025-12-04T10:32:19.4170612Z * [new branch] samplevllm -> origin/samplevllm 2025-12-04T10:32:19.4170779Z * [new branch] sanchitintel/weird_thing_with_test_cpu_select_algorithm -> origin/sanchitintel/weird_thing_with_test_cpu_select_algorithm 2025-12-04T10:32:19.4170870Z * [new branch] sapling-pr-archive-SS-JIA -> origin/sapling-pr-archive-SS-JIA 2025-12-04T10:32:19.4170981Z * [new branch] sapling-pr-archive-tushar00jain -> origin/sapling-pr-archive-tushar00jain 2025-12-04T10:32:19.4171042Z * [new branch] save -> origin/save 2025-12-04T10:32:19.4171104Z * [new branch] scaled_mm -> origin/scaled_mm 2025-12-04T10:32:19.4187710Z * [new branch] scan_attempt -> origin/scan_attempt 2025-12-04T10:32:19.4187803Z * [new branch] sdym/2.5.1 -> origin/sdym/2.5.1 2025-12-04T10:32:19.4187944Z * [new branch] sekyondaMeta-dynamoconfig-fix -> origin/sekyondaMeta-dynamoconfig-fix 2025-12-04T10:32:19.4188035Z * [new branch] shengf/fx-xform-perf -> origin/shengf/fx-xform-perf 2025-12-04T10:32:19.4188116Z * [new branch] shoumikhin-patch-1 -> origin/shoumikhin-patch-1 2025-12-04T10:32:19.4188190Z * [new branch] solve-accuracy-fix -> origin/solve-accuracy-fix 2025-12-04T10:32:19.4188270Z * [new branch] some_rocm_inductor_skips -> origin/some_rocm_inductor_skips 2025-12-04T10:32:19.4188353Z * [new branch] soulitzer/stash-tls-ac -> origin/soulitzer/stash-tls-ac 2025-12-04T10:32:19.4188508Z * [new branch] sparse-mm-bf16-support -> origin/sparse-mm-bf16-support 2025-12-04T10:32:19.4188580Z * [new branch] starterTaskUpdate -> origin/starterTaskUpdate 2025-12-04T10:32:19.4188640Z * [new branch] suo -> origin/suo 2025-12-04T10:32:19.4188704Z * [new branch] sve-poc -> origin/sve-poc 2025-12-04T10:32:19.4188766Z * [new branch] switch-bn -> origin/switch-bn 2025-12-04T10:32:19.4188859Z * [new branch] sy_annotation_in_autograd_hop -> origin/sy_annotation_in_autograd_hop 2025-12-04T10:32:19.4188927Z * [new branch] sy_aot_eager_record -> origin/sy_aot_eager_record 2025-12-04T10:32:19.4188995Z * [new branch] sy_custom_bucketing -> origin/sy_custom_bucketing 2025-12-04T10:32:19.4189062Z * [new branch] sy_debug_mode_test -> origin/sy_debug_mode_test 2025-12-04T10:32:19.4189130Z * [new branch] sy_deserialize -> origin/sy_deserialize 2025-12-04T10:32:19.4189196Z * [new branch] sy_dump_gm_code -> origin/sy_dump_gm_code 2025-12-04T10:32:19.4189257Z * [new branch] sy_exp -> origin/sy_exp 2025-12-04T10:32:19.4189327Z * [new branch] sy_export_annotation -> origin/sy_export_annotation 2025-12-04T10:32:19.4189433Z * [new branch] sy_invoke_subgraph -> origin/sy_invoke_subgraph 2025-12-04T10:32:19.4189500Z * [new branch] sy_kernel_bw_name -> origin/sy_kernel_bw_name 2025-12-04T10:32:19.4189562Z * [new branch] sy_multi_arch -> origin/sy_multi_arch 2025-12-04T10:32:19.4189673Z * [new branch] sy_nn_module_stack -> origin/sy_nn_module_stack 2025-12-04T10:32:19.4189745Z * [new branch] sy_original_dtensor -> origin/sy_original_dtensor 2025-12-04T10:32:19.4189817Z * [new branch] sy_profiler_cia -> origin/sy_profiler_cia 2025-12-04T10:32:19.4189880Z * [new branch] symm_mem_sync -> origin/symm_mem_sync 2025-12-04T10:32:19.4189964Z * [new branch] sympy-bottleneck-repro -> origin/sympy-bottleneck-repro 2025-12-04T10:32:19.4190041Z * 
[new branch] tensordict_integration -> origin/tensordict_integration 2025-12-04T10:32:19.4190126Z * [new branch] test-move-conda-builds -> origin/test-move-conda-builds 2025-12-04T10:32:19.4190187Z * [new branch] test-old -> origin/test-old 2025-12-04T10:32:19.4190250Z * [new branch] test/bmm_heur -> origin/test/bmm_heur 2025-12-04T10:32:19.4190347Z * [new branch] tianren/customOp_autotune_fix -> origin/tianren/customOp_autotune_fix 2025-12-04T10:32:19.4190462Z * [new branch] tianren/customOp_enable_max_autotune -> origin/tianren/customOp_enable_max_autotune 2025-12-04T10:32:19.4190548Z * [new branch] tianren/customOp_fusion -> origin/tianren/customOp_fusion 2025-12-04T10:32:19.4190674Z * [new branch] tianren/customop_collectiveop_benchmark -> origin/tianren/customop_collectiveop_benchmark 2025-12-04T10:32:19.4190808Z * [new branch] tianren/customop_collectiveop_benchmark_fix -> origin/tianren/customop_collectiveop_benchmark_fix 2025-12-04T10:32:19.4190913Z * [new branch] tianren/customop_dynamic_config -> origin/tianren/customop_dynamic_config 2025-12-04T10:32:19.4191007Z * [new branch] tianren/dynamic_range_input -> origin/tianren/dynamic_range_input 2025-12-04T10:32:19.4191105Z * [new branch] tianren/dynamic_range_input_fix -> origin/tianren/dynamic_range_input_fix 2025-12-04T10:32:19.4191212Z * [new branch] tianren/dynamic_range_input_merge -> origin/tianren/dynamic_range_input_merge 2025-12-04T10:32:19.4191313Z * [new branch] tianren/flex_paged_attn_fix_temp -> origin/tianren/flex_paged_attn_fix_temp 2025-12-04T10:32:19.4191444Z * [new branch] tianren/fx_codegen_dump -> origin/tianren/fx_codegen_dump 2025-12-04T10:32:19.4191526Z * [new branch] tianren/symmetric_memory -> origin/tianren/symmetric_memory 2025-12-04T10:32:19.4191590Z * [new branch] tianren/test -> origin/tianren/test 2025-12-04T10:32:19.4191666Z * [new branch] tidy_performance_cyy -> origin/tidy_performance_cyy 2025-12-04T10:32:19.4191725Z * [new branch] tmp -> origin/tmp 2025-12-04T10:32:19.4191790Z * [new branch] torchtitan_ep -> origin/torchtitan_ep 2025-12-04T10:32:19.4191867Z * [new branch] torchtitan_integration -> origin/torchtitan_integration 2025-12-04T10:32:19.4191952Z * [new branch] trace_fsdp_torchtune_lora -> origin/trace_fsdp_torchtune_lora 2025-12-04T10:32:19.4192036Z * [new branch] traceable_fsdp_unit_tests -> origin/traceable_fsdp_unit_tests 2025-12-04T10:32:19.4192104Z * [new branch] tree_loop_vec_base -> origin/tree_loop_vec_base 2025-12-04T10:32:19.4192169Z * [new branch] triton_kernel -> origin/triton_kernel 2025-12-04T10:32:19.4192229Z * [new branch] tt_pkg_1908 -> origin/tt_pkg_1908 2025-12-04T10:32:19.4192332Z * [new branch] type_dec -> origin/type_dec 2025-12-04T10:32:19.4192426Z * [new branch] udate-sphinx-dependancies -> origin/udate-sphinx-dependancies 2025-12-04T10:32:19.4192565Z * [new branch] update-audio-commit-hash/17630256502-1803-1 -> origin/update-audio-commit-hash/17630256502-1803-1 2025-12-04T10:32:19.4192701Z * [new branch] update-audio-commit-hash/19087141161-1916-1 -> origin/update-audio-commit-hash/19087141161-1916-1 2025-12-04T10:32:19.4192831Z * [new branch] update-audio-commit-hash/19250643381-1929-1 -> origin/update-audio-commit-hash/19250643381-1929-1 2025-12-04T10:32:19.4192961Z * [new branch] update-audio-commit-hash/19397724337-1935-1 -> origin/update-audio-commit-hash/19397724337-1935-1 2025-12-04T10:32:19.4193092Z * [new branch] update-audio-commit-hash/19555670148-1941-1 -> origin/update-audio-commit-hash/19555670148-1941-1 2025-12-04T10:32:19.4193222Z * [new branch] 
update-audio-commit-hash/19750627930-1946-1 -> origin/update-audio-commit-hash/19750627930-1946-1 2025-12-04T10:32:19.4193356Z * [new branch] update-triton-commit-hash/13663274526-1487-2 -> origin/update-triton-commit-hash/13663274526-1487-2 2025-12-04T10:32:19.4193490Z * [new branch] update-vision-commit-hash/19087141161-1916-1 -> origin/update-vision-commit-hash/19087141161-1916-1 2025-12-04T10:32:19.4193622Z * [new branch] update-vision-commit-hash/19184897099-1925-1 -> origin/update-vision-commit-hash/19184897099-1925-1 2025-12-04T10:32:19.4193754Z * [new branch] update-vision-commit-hash/19250643381-1929-1 -> origin/update-vision-commit-hash/19250643381-1929-1 2025-12-04T10:32:19.4193887Z * [new branch] update-vision-commit-hash/19381328640-1934-1 -> origin/update-vision-commit-hash/19381328640-1934-1 2025-12-04T10:32:19.4194019Z * [new branch] update-vision-commit-hash/19485237164-1938-1 -> origin/update-vision-commit-hash/19485237164-1938-1 2025-12-04T10:32:19.4194150Z * [new branch] update-vllm-commit-hash/18451675449-1879-1 -> origin/update-vllm-commit-hash/18451675449-1879-1 2025-12-04T10:32:19.4194233Z * [new branch] update-vllm-dockerfile -> origin/update-vllm-dockerfile 2025-12-04T10:32:19.4194357Z * [new branch] update-xla-commit-hash/19224287370-211-1 -> origin/update-xla-commit-hash/19224287370-211-1 2025-12-04T10:32:19.4194481Z * [new branch] update-xla-commit-hash/19422028566-212-1 -> origin/update-xla-commit-hash/19422028566-212-1 2025-12-04T10:32:19.4194631Z * [new branch] update-xla-commit-hash/19626841311-213-1 -> origin/update-xla-commit-hash/19626841311-213-1 2025-12-04T10:32:19.4194759Z * [new branch] update_docs_torch_multinomial_issue#125388 -> origin/update_docs_torch_multinomial_issue#125388 2025-12-04T10:32:19.4194843Z * [new branch] update_operator_readme -> origin/update_operator_readme 2025-12-04T10:32:19.4194933Z * [new branch] update_slow_tests_1722488736 -> origin/update_slow_tests_1722488736 2025-12-04T10:32:19.4195019Z * [new branch] update_slow_tests_1722879173 -> origin/update_slow_tests_1722879173 2025-12-04T10:32:19.4195106Z * [new branch] update_slow_tests_1762155677 -> origin/update_slow_tests_1762155677 2025-12-04T10:32:19.4195190Z * [new branch] update_slow_tests_1763365283 -> origin/update_slow_tests_1763365283 2025-12-04T10:32:19.4195288Z * [new branch] update_submodule_FBGEMM -> origin/update_submodule_FBGEMM 2025-12-04T10:32:19.4195366Z * [new branch] update_submodule_kineto -> origin/update_submodule_kineto 2025-12-04T10:32:19.4195455Z * [new branch] update_submodule_tensorpipe -> origin/update_submodule_tensorpipe 2025-12-04T10:32:19.4195557Z * [new branch] upload-tests-for-autorevert -> origin/upload-tests-for-autorevert 2025-12-04T10:32:19.4195645Z * [new branch] v0.1.2 -> origin/v0.1.2 2025-12-04T10:32:19.4195705Z * [new branch] v1.0.1 -> origin/v1.0.1 2025-12-04T10:32:19.4195767Z * [new branch] v1.0.3 -> origin/v1.0.3 2025-12-04T10:32:19.4195822Z * [new branch] v1.1.0 -> origin/v1.1.0 2025-12-04T10:32:19.4195878Z * [new branch] v1.2.0 -> origin/v1.2.0 2025-12-04T10:32:19.4195936Z * [new branch] v1.3.0 -> origin/v1.3.0 2025-12-04T10:32:19.4195992Z * [new branch] v1.3.1 -> origin/v1.3.1 2025-12-04T10:32:19.4196056Z * [new branch] validate_fn -> origin/validate_fn 2025-12-04T10:32:19.4196124Z * [new branch] validations_2.6 -> origin/validations_2.6 2025-12-04T10:32:19.4196192Z * [new branch] validations_2.8 -> origin/validations_2.8 2025-12-04T10:32:19.4196256Z * [new branch] varlen-api -> origin/varlen-api 2025-12-04T10:32:19.4196331Z * 
[new branch] varlen-api-backup -> origin/varlen-api-backup 2025-12-04T10:32:19.4196405Z * [new branch] varlen_batch_invariance -> origin/varlen_batch_invariance 2025-12-04T10:32:19.4196468Z * [new branch] viable/strict -> origin/viable/strict 2025-12-04T10:32:19.4196586Z * [new branch] vishal9-team/dtensor_parallelism_toy -> origin/vishal9-team/dtensor_parallelism_toy 2025-12-04T10:32:19.4196650Z * [new branch] vllmbuildci -> origin/vllmbuildci 2025-12-04T10:32:19.4196710Z * [new branch] vllmpin -> origin/vllmpin 2025-12-04T10:32:19.4196802Z * [new branch] vscode-recommend-pyrefly -> origin/vscode-recommend-pyrefly 2025-12-04T10:32:19.4196869Z * [new branch] wdvr-patch-1 -> origin/wdvr-patch-1 2025-12-04T10:32:19.4196934Z * [new branch] wdvr/iss_145259 -> origin/wdvr/iss_145259 2025-12-04T10:32:19.4196996Z * [new branch] whc/pei -> origin/whc/pei 2025-12-04T10:32:19.4197061Z * [new branch] whc/pp_fix -> origin/whc/pp_fix 2025-12-04T10:32:19.4197124Z * [new branch] whc/sharding -> origin/whc/sharding 2025-12-04T10:32:19.4197188Z * [new branch] whc/sharding2 -> origin/whc/sharding2 2025-12-04T10:32:19.4197292Z * [new branch] whc/uneven -> origin/whc/uneven 2025-12-04T10:32:19.4197361Z * [new branch] whc/uneven-merge -> origin/whc/uneven-merge 2025-12-04T10:32:19.4197423Z * [new branch] win_warnings -> origin/win_warnings 2025-12-04T10:32:19.4197497Z * [new branch] windows_libtorch_free -> origin/windows_libtorch_free 2025-12-04T10:32:19.4197561Z * [new branch] xmfan-war -> origin/xmfan-war 2025-12-04T10:32:19.4197625Z * [new branch] xmfan/ca_0516 -> origin/xmfan/ca_0516 2025-12-04T10:32:19.4197694Z * [new branch] xmfan/ca_1051b93192 -> origin/xmfan/ca_1051b93192 2025-12-04T10:32:19.4197846Z * [new branch] xmfan/ca_1a722f62c248391fc4a542e8851a5559aa356ae8 -> origin/xmfan/ca_1a722f62c248391fc4a542e8851a5559aa356ae8 2025-12-04T10:32:19.4197917Z * [new branch] xmfan/ca_5a2be192d1 -> origin/xmfan/ca_5a2be192d1 2025-12-04T10:32:19.4197988Z * [new branch] xmfan/ca_9d59b516e9 -> origin/xmfan/ca_9d59b516e9 2025-12-04T10:32:19.4198051Z * [new branch] xmfan/ca_apr8 -> origin/xmfan/ca_apr8 2025-12-04T10:32:19.4198114Z * [new branch] xmfan/ca_base -> origin/xmfan/ca_base 2025-12-04T10:32:19.4198181Z * [new branch] xmfan/ca_dynamic -> origin/xmfan/ca_dynamic 2025-12-04T10:32:19.4198279Z * [new branch] xmfan/ca_fix_dyn -> origin/xmfan/ca_fix_dyn 2025-12-04T10:32:19.4198351Z * [new branch] xmfan/ca_fix_lowering -> origin/xmfan/ca_fix_lowering 2025-12-04T10:32:19.4198425Z * [new branch] xmfan/ca_fix_polyfills -> origin/xmfan/ca_fix_polyfills 2025-12-04T10:32:19.4198489Z * [new branch] xmfan/ca_jan3 -> origin/xmfan/ca_jan3 2025-12-04T10:32:19.4198553Z * [new branch] xmfan/ca_jun18 -> origin/xmfan/ca_jun18 2025-12-04T10:32:19.4198618Z * [new branch] xmfan/ca_jun24 -> origin/xmfan/ca_jun24 2025-12-04T10:32:19.4198684Z * [new branch] xmfan/ca_nested -> origin/xmfan/ca_nested 2025-12-04T10:32:19.4198750Z * [new branch] xmfan/ca_overhead -> origin/xmfan/ca_overhead 2025-12-04T10:32:19.4198841Z * [new branch] xmfan/ca_overhead_0eba7e5451 -> origin/xmfan/ca_overhead_0eba7e5451 2025-12-04T10:32:19.4198910Z * [new branch] xmfan/cacu_jun18 -> origin/xmfan/cacu_jun18 2025-12-04T10:32:19.4198976Z * [new branch] xmfan/cacu_jun19 -> origin/xmfan/cacu_jun19 2025-12-04T10:32:19.4199040Z * [new branch] xmfan/cacu_jun4 -> origin/xmfan/cacu_jun4 2025-12-04T10:32:19.4199123Z * [new branch] xmfan/disable_duck_shape -> origin/xmfan/disable_duck_shape 2025-12-04T10:32:19.4199221Z * [new branch] xmfan/fca_cpp_node_passthrough -> 
origin/xmfan/fca_cpp_node_passthrough 2025-12-04T10:32:19.4199371Z * [new branch] xmfan/post_3945954741e2d37023c5d6954f9483008e0892f9 -> origin/xmfan/post_3945954741e2d37023c5d6954f9483008e0892f9 2025-12-04T10:32:19.4199517Z * [new branch] xmfan/pre_3945954741e2d37023c5d6954f9483008e0892f9 -> origin/xmfan/pre_3945954741e2d37023c5d6954f9483008e0892f9 2025-12-04T10:32:19.4199618Z * [new branch] xmfan/single_step -> origin/xmfan/single_step 2025-12-04T10:32:19.4199685Z * [new branch] xmfan/sth_0829 -> origin/xmfan/sth_0829 2025-12-04T10:32:19.4199746Z * [new branch] xmfan/test -> origin/xmfan/test 2025-12-04T10:32:19.4199832Z * [new branch] yguo/debug-0226-constexpr -> origin/yguo/debug-0226-constexpr 2025-12-04T10:32:19.4199910Z * [new branch] yguo/new_latest_changes -> origin/yguo/new_latest_changes 2025-12-04T10:32:19.4200005Z * [new branch] yguo/patch_constexpr_changes -> origin/yguo/patch_constexpr_changes 2025-12-04T10:32:19.4200126Z * [new branch] yiming/bootcamp -> origin/yiming/bootcamp 2025-12-04T10:32:19.4200229Z * [new branch] yiming/run_with_start_end_rng_hop -> origin/yiming/run_with_start_end_rng_hop 2025-12-04T10:32:19.4200292Z * [new branch] yolo-llama3 -> origin/yolo-llama3 2025-12-04T10:32:19.4200364Z * [new branch] zainr/canary-test -> origin/zainr/canary-test 2025-12-04T10:32:19.4200453Z * [new branch] zainr/cleanup-gh-runners -> origin/zainr/cleanup-gh-runners 2025-12-04T10:32:19.4200533Z * [new branch] zainr/pull-migration-c -> origin/zainr/pull-migration-c 2025-12-04T10:32:19.4200595Z * [new branch] zainr/test2 -> origin/zainr/test2 2025-12-04T10:32:19.4200669Z * [new branch] zasdfgbnm-patch-3 -> origin/zasdfgbnm-patch-3 2025-12-04T10:32:19.4200727Z * [new branch] zb2p -> origin/zb2p 2025-12-04T10:32:19.4200812Z * [new branch] zeros-and-scatter-part2 -> origin/zeros-and-scatter-part2 2025-12-04T10:32:19.4200901Z * [new branch] zhxchen17/ci/vllm_lora_oom -> origin/zhxchen17/ci/vllm_lora_oom 2025-12-04T10:32:19.4201004Z * [new branch] zhxchen17/ci/vllm_multimodal_oom -> origin/zhxchen17/ci/vllm_multimodal_oom 2025-12-04T10:32:19.4201119Z * [new branch] zhxchen17/ci/vllm_pin -> origin/zhxchen17/ci/vllm_pin 2025-12-04T10:32:19.4201243Z * [new branch] zhxchen17/dynamo/unsafe_drop_all_guards -> origin/zhxchen17/dynamo/unsafe_drop_all_guards 2025-12-04T10:32:19.4201341Z * [new branch] zhxchen17/export/call_override -> origin/zhxchen17/export/call_override 2025-12-04T10:32:19.4201426Z * [new branch] zhxchen17/export/codemod1 -> origin/zhxchen17/export/codemod1 2025-12-04T10:32:19.4201515Z * [new branch] zhxchen17/export/ctx_return -> origin/zhxchen17/export/ctx_return 2025-12-04T10:32:19.4201643Z * [new branch] zhxchen17/export/disable_side_effect_warn -> origin/zhxchen17/export/disable_side_effect_warn 2025-12-04T10:32:19.4201744Z * [new branch] zhxchen17/export/pytree_check -> origin/zhxchen17/export/pytree_check 2025-12-04T10:32:19.4201830Z * [new branch] zhxchen17/precompile/aoti -> origin/zhxchen17/precompile/aoti 2025-12-04T10:32:19.4201926Z * [new branch] zhxchen17/precompile/globals -> origin/zhxchen17/precompile/globals 2025-12-04T10:32:19.4202042Z * [new branch] zhxchen17/precompile/inductor_guards -> origin/zhxchen17/precompile/inductor_guards 2025-12-04T10:32:19.4202114Z * [new branch] zhxchen17/scratch/0 -> origin/zhxchen17/scratch/0 2025-12-04T10:32:19.4202219Z * [new branch] zhxchen17/torch_export_api_update -> origin/zhxchen17/torch_export_api_update 2025-12-04T10:32:19.4202295Z * [new branch] zhxhcen17/moodycamel -> origin/zhxhcen17/moodycamel 
2025-12-04T10:32:19.4202370Z * [new branch] zxiiro/build-times -> origin/zxiiro/build-times 2025-12-04T10:32:19.4202442Z * [new branch] zxiiro/c7i.2xlarge -> origin/zxiiro/c7i.2xlarge 2025-12-04T10:32:19.4202520Z * [new branch] zxiiro/c7i.2xlarge.h100 -> origin/zxiiro/c7i.2xlarge.h100 2025-12-04T10:32:19.4202582Z * [new branch] zxiiro/main -> origin/zxiiro/main 2025-12-04T10:32:19.4202645Z * [new branch] zxiiro/risc64 -> origin/zxiiro/risc64 2025-12-04T10:32:19.4202737Z * [new branch] zxiiro/test-multicloud-arc -> origin/zxiiro/test-multicloud-arc 2025-12-04T10:32:19.4202807Z t [tag update] ciflow/inductor/169437 -> ciflow/inductor/169437 2025-12-04T10:32:19.4202872Z t [tag update] ciflow/trunk/169437 -> ciflow/trunk/169437 2025-12-04T10:32:19.4203008Z * [new tag] trunk/c0cb6e78404416d418350632bfc554710a5f7281 -> trunk/c0cb6e78404416d418350632bfc554710a5f7281 2025-12-04T10:32:19.6164008Z [command]/usr/bin/git rev-parse --verify --quiet ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32^{object} 2025-12-04T10:32:19.6350222Z ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T10:32:19.6354404Z ##[endgroup] 2025-12-04T10:32:19.6354728Z ##[group]Determining the checkout info 2025-12-04T10:32:19.6356126Z ##[endgroup] 2025-12-04T10:32:19.6361308Z [command]/usr/bin/git sparse-checkout disable 2025-12-04T10:32:19.6449711Z [command]/usr/bin/git config --local --unset-all extensions.worktreeConfig 2025-12-04T10:32:19.6466842Z ##[group]Checking out the ref 2025-12-04T10:32:19.6468683Z [command]/usr/bin/git checkout --progress --force ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T10:32:19.7286564Z Previous HEAD position was c0cb6e784044 [DTensor] ExplicitRedistributionContext warning mode (#169452) 2025-12-04T10:32:19.7290133Z HEAD is now at ffd9b0fb4355 Resolve collective autotuning test failure on arm (#168919) 2025-12-04T10:32:19.7373501Z ##[endgroup] 2025-12-04T10:32:19.7373718Z ##[group]Setting up auth for fetching submodules 2025-12-04T10:32:19.7379138Z [command]/usr/bin/git config --global http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-12-04T10:32:19.7405890Z [command]/usr/bin/git config --global --unset-all url.https://github.com/.insteadOf 2025-12-04T10:32:19.7425617Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf git@github.com: 2025-12-04T10:32:19.7449221Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf org-21003710@github.com: 2025-12-04T10:32:19.7464383Z ##[endgroup] 2025-12-04T10:32:19.7464606Z ##[group]Fetching submodules 2025-12-04T10:32:19.7466002Z [command]/usr/bin/git submodule sync --recursive 2025-12-04T10:32:19.7688134Z Synchronizing submodule url for 'android/libs/fbjni' 2025-12-04T10:32:19.7701321Z Synchronizing submodule url for 'third_party/FP16' 2025-12-04T10:32:19.7718513Z Synchronizing submodule url for 'third_party/FXdiv' 2025-12-04T10:32:19.7733257Z Synchronizing submodule url for 'third_party/NNPACK' 2025-12-04T10:32:19.7750335Z Synchronizing submodule url for 'third_party/NVTX' 2025-12-04T10:32:19.7761648Z Synchronizing submodule url for 'third_party/VulkanMemoryAllocator' 2025-12-04T10:32:19.7774687Z Synchronizing submodule url for 'third_party/XNNPACK' 2025-12-04T10:32:19.7794470Z Synchronizing submodule url for 'third_party/aiter' 2025-12-04T10:32:19.7808607Z Synchronizing submodule url for 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T10:32:19.7826021Z Synchronizing submodule url for 'third_party/benchmark' 2025-12-04T10:32:19.7836476Z Synchronizing submodule url for 
'third_party/composable_kernel' 2025-12-04T10:32:19.7852734Z Synchronizing submodule url for 'third_party/cpp-httplib' 2025-12-04T10:32:19.7867806Z Synchronizing submodule url for 'third_party/cpuinfo' 2025-12-04T10:32:19.7878357Z Synchronizing submodule url for 'third_party/cudnn_frontend' 2025-12-04T10:32:19.7887699Z Synchronizing submodule url for 'third_party/cutlass' 2025-12-04T10:32:19.7899962Z Synchronizing submodule url for 'third_party/fbgemm' 2025-12-04T10:32:19.7911819Z Synchronizing submodule url for 'third_party/fbgemm/external/asmjit' 2025-12-04T10:32:19.7923035Z Synchronizing submodule url for 'third_party/fbgemm/external/composable_kernel' 2025-12-04T10:32:19.7947817Z Synchronizing submodule url for 'third_party/fbgemm/external/cpuinfo' 2025-12-04T10:32:19.7966961Z Synchronizing submodule url for 'third_party/fbgemm/external/cutlass' 2025-12-04T10:32:19.7989408Z Synchronizing submodule url for 'third_party/fbgemm/external/googletest' 2025-12-04T10:32:19.8000711Z Synchronizing submodule url for 'third_party/fbgemm/external/hipify_torch' 2025-12-04T10:32:19.8011896Z Synchronizing submodule url for 'third_party/fbgemm/external/json' 2025-12-04T10:32:19.8026915Z Synchronizing submodule url for 'third_party/flash-attention' 2025-12-04T10:32:19.8039360Z Synchronizing submodule url for 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T10:32:19.8051927Z Synchronizing submodule url for 'third_party/flash-attention/csrc/cutlass' 2025-12-04T10:32:19.8066604Z Synchronizing submodule url for 'third_party/flatbuffers' 2025-12-04T10:32:19.8078233Z Synchronizing submodule url for 'third_party/fmt' 2025-12-04T10:32:19.8090239Z Synchronizing submodule url for 'third_party/gemmlowp/gemmlowp' 2025-12-04T10:32:19.8100979Z Synchronizing submodule url for 'third_party/gloo' 2025-12-04T10:32:19.8111291Z Synchronizing submodule url for 'third_party/googletest' 2025-12-04T10:32:19.8122409Z Synchronizing submodule url for 'third_party/ideep' 2025-12-04T10:32:19.8133659Z Synchronizing submodule url for 'third_party/ideep/mkl-dnn' 2025-12-04T10:32:19.8148095Z Synchronizing submodule url for 'third_party/ittapi' 2025-12-04T10:32:19.8158384Z Synchronizing submodule url for 'third_party/kineto' 2025-12-04T10:32:19.8171135Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T10:32:19.8180946Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T10:32:19.8191620Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T10:32:19.8205322Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T10:32:19.8215216Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T10:32:19.8224417Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T10:32:19.8236013Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T10:32:19.8246644Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T10:32:19.8258467Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T10:32:19.8269922Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T10:32:19.8280855Z 
Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T10:32:19.8291522Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:19.8302446Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:19.8315033Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T10:32:19.8328350Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T10:32:19.8344735Z Synchronizing submodule url for 'third_party/kleidiai' 2025-12-04T10:32:19.8358000Z Synchronizing submodule url for 'third_party/mimalloc' 2025-12-04T10:32:19.8368711Z Synchronizing submodule url for 'third_party/nlohmann' 2025-12-04T10:32:19.8379392Z Synchronizing submodule url for 'third_party/onnx' 2025-12-04T10:32:19.8396221Z Synchronizing submodule url for 'third_party/onnx/third_party/pybind11' 2025-12-04T10:32:19.8408437Z Synchronizing submodule url for 'third_party/opentelemetry-cpp' 2025-12-04T10:32:19.8426048Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T10:32:19.8435504Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T10:32:19.8446008Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T10:32:19.8459687Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T10:32:19.8470150Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T10:32:19.8479800Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T10:32:19.8491503Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T10:32:19.8501365Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:19.8515277Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:19.8528760Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T10:32:19.8558171Z Synchronizing submodule url for 'third_party/pocketfft' 2025-12-04T10:32:19.8567509Z Synchronizing submodule url for 'third_party/protobuf' 2025-12-04T10:32:19.8579530Z Synchronizing submodule url for 'third_party/protobuf/third_party/benchmark' 2025-12-04T10:32:19.8591200Z Synchronizing submodule url for 'third_party/protobuf/third_party/googletest' 2025-12-04T10:32:19.8607385Z Synchronizing submodule url for 'third_party/psimd' 2025-12-04T10:32:19.8618528Z Synchronizing submodule url for 'third_party/pthreadpool' 2025-12-04T10:32:19.8634543Z Synchronizing submodule url for 'third_party/pybind11' 2025-12-04T10:32:19.8645346Z Synchronizing submodule url for 'third_party/python-peachpy' 2025-12-04T10:32:19.8655932Z Synchronizing submodule url for 'third_party/sleef' 2025-12-04T10:32:19.8665434Z Synchronizing submodule url for 'third_party/tensorpipe' 2025-12-04T10:32:19.8675049Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/googletest' 2025-12-04T10:32:19.8685860Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/libnop' 2025-12-04T10:32:19.8696538Z Synchronizing submodule url for 
'third_party/tensorpipe/third_party/libuv' 2025-12-04T10:32:19.8708586Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T10:32:19.8720730Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T10:32:19.8746275Z [command]/usr/bin/git -c protocol.version=2 submodule update --init --force --recursive 2025-12-04T10:32:19.8999009Z Submodule path 'android/libs/fbjni': checked out '7e1e1fe3858c63c251c637ae41a20de425dde96f' 2025-12-04T10:32:19.9064250Z Submodule path 'third_party/FP16': checked out '4dfe081cf6bcd15db339cf2680b9281b8451eeb3' 2025-12-04T10:32:19.9122804Z Submodule path 'third_party/FXdiv': checked out 'b408327ac2a15ec3e43352421954f5b1967701d1' 2025-12-04T10:32:19.9263826Z Submodule path 'third_party/NNPACK': checked out 'c07e3a0400713d546e0dea2d5466dd22ea389c73' 2025-12-04T10:32:19.9336858Z Submodule path 'third_party/NVTX': checked out '3ebbc93ded7285963bff932c678fa367eb393ba6' 2025-12-04T10:32:19.9393406Z Submodule path 'third_party/VulkanMemoryAllocator': checked out '1d8f600fd424278486eade7ed3e877c99f0846b1' 2025-12-04T10:32:20.0251980Z Submodule path 'third_party/XNNPACK': checked out '51a0103656eff6fc9bfd39a4597923c4b542c883' 2025-12-04T10:32:20.0406013Z Submodule path 'third_party/aiter': checked out '01aae101b9e5e94d6c16a9514c9fb8df99c93150' 2025-12-04T10:32:20.0586648Z Submodule path 'third_party/aiter/3rdparty/composable_kernel': checked out 'cffe8fa2a442ac8e80dd236a1a5d24fe3d7e0cbf' 2025-12-04T10:32:20.0704917Z Submodule path 'third_party/benchmark': checked out '299e5928955cc62af9968370293b916f5130916f' 2025-12-04T10:32:20.0877903Z Submodule path 'third_party/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-12-04T10:32:20.0942033Z Submodule path 'third_party/cpp-httplib': checked out '89c932f313c6437c38f2982869beacc89c2f2246' 2025-12-04T10:32:20.1585892Z Submodule path 'third_party/cpuinfo': checked out 'f858c30bcb16f8effd5ff46996f0514539e17abc' 2025-12-04T10:32:20.1692918Z Submodule path 'third_party/cudnn_frontend': checked out '0b1577c8c83401237d601d0d0db5210506705396' 2025-12-04T10:32:20.1818671Z Submodule path 'third_party/cutlass': checked out 'f88806b1e31dfa579842638740216dd41fc6c588' 2025-12-04T10:32:20.2573644Z Submodule path 'third_party/fbgemm': checked out 'c0b988d39a9e47c794d699f29930ed4d7c7e13a4' 2025-12-04T10:32:20.2886525Z Submodule path 'third_party/fbgemm/external/asmjit': checked out 'a3199e8857792cd10b7589ff5d58343d2c9008ea' 2025-12-04T10:32:20.3651157Z Submodule path 'third_party/fbgemm/external/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-12-04T10:32:20.4308795Z Submodule path 'third_party/fbgemm/external/cpuinfo': checked out '6543fec09b2f04ac4a666882998b534afc9c1349' 2025-12-04T10:32:20.8791388Z Submodule path 'third_party/fbgemm/external/cutlass': checked out '98125ce499b0fdf7ffbe0e3052f5b8709f4840f8' 2025-12-04T10:32:20.9015243Z Submodule path 'third_party/fbgemm/external/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T10:32:20.9122961Z Submodule path 'third_party/fbgemm/external/hipify_torch': checked out '63b6a7b541fa7f08f8475ca7d74054db36ff2691' 2025-12-04T10:32:20.9687554Z Submodule path 'third_party/fbgemm/external/json': checked out '9cca280a4d0ccf0c08f47a99aa71d1b0e52f8d03' 2025-12-04T10:32:20.9780524Z Submodule path 'third_party/flash-attention': checked out '979702c87a8713a8e0a5e9fee122b90d2ef13be5' 2025-12-04T10:32:21.0000127Z Submodule path 
'third_party/flash-attention/csrc/composable_kernel': checked out '888317e698e9803c62bd38568abc9e05d7709f33' 2025-12-04T10:32:21.0133982Z Submodule path 'third_party/flash-attention/csrc/cutlass': checked out 'c506e16788cb08416a4a57e11a9067beeee29420' 2025-12-04T10:32:21.0225579Z Submodule path 'third_party/flatbuffers': checked out 'a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757' 2025-12-04T10:32:21.0382847Z Submodule path 'third_party/fmt': checked out '407c905e45ad75fc29bf0f9bb7c5c2fd3475976f' 2025-12-04T10:32:21.0594493Z Submodule path 'third_party/gemmlowp/gemmlowp': checked out '3fb5c176c17c765a3492cd2f0321b0dab712f350' 2025-12-04T10:32:21.0711877Z Submodule path 'third_party/gloo': checked out '54cbae0d3a67fa890b4c3d9ee162b7860315e341' 2025-12-04T10:32:21.0907674Z Submodule path 'third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T10:32:21.0986835Z Submodule path 'third_party/ideep': checked out '719d8e6cd7f7a0e01b155657526d693acf97c2b3' 2025-12-04T10:32:21.4210912Z Submodule path 'third_party/ideep/mkl-dnn': checked out '8d263e693366ef8db40acc569cc7d8edf644556d' 2025-12-04T10:32:21.4313603Z Submodule path 'third_party/ittapi': checked out 'dec1d23ca65ab069d225dfe40dea14f455170959' 2025-12-04T10:32:21.4406850Z Submodule path 'third_party/kineto': checked out '31f85df8fbd89c188f14ef10f1ec65379786b943' 2025-12-04T10:32:21.4498167Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog': checked out 'd2ffe0a4e3acace628db49974246b66fc3e85fb1' 2025-12-04T10:32:21.4578854Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM': checked out 'ffde4e54bc7249a6039a5e6b45b395141e1217f9' 2025-12-04T10:32:21.4658348Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr': checked out '871ed52d350214a034f6ef8a3b8f51c5ce1bd400' 2025-12-04T10:32:21.4740258Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt': checked out 'cd4af11efc9c622896a3e4cb599fa28668ca3d05' 2025-12-04T10:32:21.4792524Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags': checked out 'e171aa2d15ed9eb17054558e0b3a6a413bb01067' 2025-12-04T10:32:21.4867967Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc': checked out '8411df715cf522606e3b1aca386ddfc0b63d34b4' 2025-12-04T10:32:21.4942080Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog': checked out 'b33e3bad4c46c8a6345525fd822af355e5ef9446' 2025-12-04T10:32:21.5005388Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T10:32:21.5097507Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json': checked out '4f8fba14066156b73f1189a2b8bd568bde5284c5' 2025-12-04T10:32:21.5159763Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs': checked out 'f68a2fa8ea36c783bdd760371411fcb495aa3150' 2025-12-04T10:32:21.5231241Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp': checked out 'b1234816facfdda29845c46696a02998a4af115a' 2025-12-04T10:32:21.5307939Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'd7ba35bbb649209c66e582d5a0244ba988a15159' 2025-12-04T10:32:21.5363797Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest': 
checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-12-04T10:32:21.5434860Z Submodule path 'third_party/kineto/libkineto/third_party/fmt': checked out '40626af88bd7df9a5fb80be7b25ac85b122d6c21' 2025-12-04T10:32:21.5488428Z Submodule path 'third_party/kineto/libkineto/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T10:32:21.5571467Z Submodule path 'third_party/kleidiai': checked out 'd7770c89632329a9914ef1a90289917597639cbe' 2025-12-04T10:32:21.5636601Z Submodule path 'third_party/mimalloc': checked out 'fbd8b99c2b828428947d70fdc046bb55609be93e' 2025-12-04T10:32:21.5730837Z Submodule path 'third_party/nlohmann': checked out '55f93686c01528224f448c19128836e7df245f72' 2025-12-04T10:32:21.7492649Z Submodule path 'third_party/onnx': checked out 'e709452ef2bbc1d113faf678c24e6d3467696e83' 2025-12-04T10:32:21.7674352Z Submodule path 'third_party/onnx/third_party/pybind11': checked out 'a2e59f0e7065404b44dfe92a28aca47ba1378dc4' 2025-12-04T10:32:21.7793402Z Submodule path 'third_party/opentelemetry-cpp': checked out 'a799f4aed9c94b765dcdaabaeab7d5e7e2310878' 2025-12-04T10:32:21.7861875Z Submodule path 'third_party/opentelemetry-cpp/third_party/benchmark': checked out 'd572f4777349d43653b21d6c2fc63020ab326db2' 2025-12-04T10:32:21.7929487Z Submodule path 'third_party/opentelemetry-cpp/third_party/googletest': checked out 'b796f7d44681514f58a683a3a71ff17c94edb0c1' 2025-12-04T10:32:21.7977576Z Submodule path 'third_party/opentelemetry-cpp/third_party/ms-gsl': checked out '6f4529395c5b7c2d661812257cd6780c67e54afa' 2025-12-04T10:32:21.8059745Z Submodule path 'third_party/opentelemetry-cpp/third_party/nlohmann-json': checked out 'bc889afb4c5bf1c0d8ee29ef35eaaf4c8bef8a5d' 2025-12-04T10:32:21.8121124Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto': checked out '4ca4f0335c63cda7ab31ea7ed70d6553aee14dce' 2025-12-04T10:32:21.8183471Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp': checked out '06b57f48ded1fa3bdd3d4346f6ef29e40e08eaf5' 2025-12-04T10:32:21.8244915Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp': checked out 'c9ffcdda9086ffd9e1283ea7a0276d831f3c8a8d' 2025-12-04T10:32:21.8338822Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'eefb26f82b233268fc98577d265352720d477ba4' 2025-12-04T10:32:21.8421394Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-12-04T10:32:21.8581965Z Submodule path 'third_party/opentelemetry-cpp/tools/vcpkg': checked out '8eb57355a4ffb410a2e94c07b4dca2dffbee8e50' 2025-12-04T10:32:21.8651048Z Submodule path 'third_party/pocketfft': checked out '0fa0ef591e38c2758e3184c6c23e497b9f732ffa' 2025-12-04T10:32:21.9947765Z Submodule path 'third_party/protobuf': checked out 'd1eca4e4b421cd2997495c4b4e65cea6be4e9b8a' 2025-12-04T10:32:22.0033826Z Submodule path 'third_party/protobuf/third_party/benchmark': checked out '5b7683f49e1e9223cf9927b24f6fd3d6bd82e3f8' 2025-12-04T10:32:22.0246965Z Submodule path 'third_party/protobuf/third_party/googletest': checked out '5ec7f0c4a113e2f18ac2c6cc7df51ad6afc24081' 2025-12-04T10:32:22.0310746Z Submodule path 'third_party/psimd': checked out '072586a71b55b7f8c584153d223e95687148a900' 2025-12-04T10:32:22.0401000Z Submodule path 'third_party/pthreadpool': checked out '4fe0e1e183925bf8cfa6aae24237e724a96479b8' 2025-12-04T10:32:22.0585524Z Submodule path 
'third_party/pybind11': checked out 'f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8' 2025-12-04T10:32:22.0820246Z Submodule path 'third_party/python-peachpy': checked out 'f45429b087dd7d5bc78bb40dc7cf06425c252d67' 2025-12-04T10:32:22.1077587Z Submodule path 'third_party/sleef': checked out '5a1d179df9cf652951b59010a2d2075372d67f68' 2025-12-04T10:32:22.1188388Z Submodule path 'third_party/tensorpipe': checked out '2b4cd91092d335a697416b2a3cb398283246849d' 2025-12-04T10:32:22.1382063Z Submodule path 'third_party/tensorpipe/third_party/googletest': checked out 'aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e' 2025-12-04T10:32:22.1464080Z Submodule path 'third_party/tensorpipe/third_party/libnop': checked out '910b55815be16109f04f4180e9adee14fb4ce281' 2025-12-04T10:32:22.1763112Z Submodule path 'third_party/tensorpipe/third_party/libuv': checked out '5152db2cbfeb5582e9c27c5ea1dba2cd9e10759b' 2025-12-04T10:32:22.1892261Z Submodule path 'third_party/tensorpipe/third_party/pybind11': checked out 'a23996fce38ff6ccfbcdc09f1e63f2c4be5ea2ef' 2025-12-04T10:32:22.1955598Z Submodule path 'third_party/tensorpipe/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2025-12-04T10:32:22.1983353Z [command]/usr/bin/git submodule foreach --recursive git config --local gc.auto 0 2025-12-04T10:32:22.2214182Z Entering 'android/libs/fbjni' 2025-12-04T10:32:22.2242572Z Entering 'third_party/FP16' 2025-12-04T10:32:22.2269455Z Entering 'third_party/FXdiv' 2025-12-04T10:32:22.2293749Z Entering 'third_party/NNPACK' 2025-12-04T10:32:22.2327323Z Entering 'third_party/NVTX' 2025-12-04T10:32:22.2354968Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T10:32:22.2377861Z Entering 'third_party/XNNPACK' 2025-12-04T10:32:22.2401532Z Entering 'third_party/aiter' 2025-12-04T10:32:22.2419394Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T10:32:22.2451712Z Entering 'third_party/benchmark' 2025-12-04T10:32:22.2483595Z Entering 'third_party/composable_kernel' 2025-12-04T10:32:22.2524385Z Entering 'third_party/cpp-httplib' 2025-12-04T10:32:22.2548500Z Entering 'third_party/cpuinfo' 2025-12-04T10:32:22.2568650Z Entering 'third_party/cudnn_frontend' 2025-12-04T10:32:22.2586976Z Entering 'third_party/cutlass' 2025-12-04T10:32:22.2613134Z Entering 'third_party/fbgemm' 2025-12-04T10:32:22.2636601Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T10:32:22.2662845Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T10:32:22.2687140Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T10:32:22.2714947Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T10:32:22.2737055Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T10:32:22.2755483Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T10:32:22.2772953Z Entering 'third_party/fbgemm/external/json' 2025-12-04T10:32:22.2793367Z Entering 'third_party/flash-attention' 2025-12-04T10:32:22.2819081Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T10:32:22.2853141Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T10:32:22.2883912Z Entering 'third_party/flatbuffers' 2025-12-04T10:32:22.2906749Z Entering 'third_party/fmt' 2025-12-04T10:32:22.2930468Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T10:32:22.2951329Z Entering 'third_party/gloo' 2025-12-04T10:32:22.2973348Z Entering 'third_party/googletest' 2025-12-04T10:32:22.2992276Z Entering 'third_party/ideep' 2025-12-04T10:32:22.3010999Z Entering 'third_party/ideep/mkl-dnn' 
2025-12-04T10:32:22.3038839Z Entering 'third_party/ittapi' 2025-12-04T10:32:22.3057729Z Entering 'third_party/kineto' 2025-12-04T10:32:22.3081330Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T10:32:22.3101975Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T10:32:22.3123313Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T10:32:22.3143380Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T10:32:22.3166419Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T10:32:22.3193458Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T10:32:22.3223812Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T10:32:22.3244284Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T10:32:22.3267845Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T10:32:22.3287282Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T10:32:22.3305544Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T10:32:22.3324269Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:22.3345996Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:22.3376211Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T10:32:22.3398733Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T10:32:22.3421195Z Entering 'third_party/kleidiai' 2025-12-04T10:32:22.3441537Z Entering 'third_party/mimalloc' 2025-12-04T10:32:22.3461762Z Entering 'third_party/nlohmann' 2025-12-04T10:32:22.3481618Z Entering 'third_party/onnx' 2025-12-04T10:32:22.3507177Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T10:32:22.3530476Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T10:32:22.3558878Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T10:32:22.3579073Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T10:32:22.3603333Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T10:32:22.3624825Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T10:32:22.3645789Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T10:32:22.3663812Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T10:32:22.3683011Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T10:32:22.3706791Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:22.3727597Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:22.3747581Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T10:32:22.3774749Z Entering 'third_party/pocketfft' 2025-12-04T10:32:22.3794371Z Entering 'third_party/protobuf' 2025-12-04T10:32:22.3813682Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T10:32:22.3836870Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T10:32:22.3860038Z Entering 'third_party/psimd' 2025-12-04T10:32:22.3880890Z Entering 'third_party/pthreadpool' 2025-12-04T10:32:22.3907544Z Entering 
'third_party/pybind11' 2025-12-04T10:32:22.3928003Z Entering 'third_party/python-peachpy' 2025-12-04T10:32:22.3947804Z Entering 'third_party/sleef' 2025-12-04T10:32:22.3969476Z Entering 'third_party/tensorpipe' 2025-12-04T10:32:22.3990889Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T10:32:22.4007553Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T10:32:22.4025570Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T10:32:22.4048139Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T10:32:22.4066363Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T10:32:22.4096503Z ##[endgroup] 2025-12-04T10:32:22.4096691Z ##[group]Persisting credentials for submodules 2025-12-04T10:32:22.4101572Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'url\.https\:\/\/github\.com\/\.insteadOf' && git config --local --unset-all 'url.https://github.com/.insteadOf' || :" 2025-12-04T10:32:22.4259562Z Entering 'android/libs/fbjni' 2025-12-04T10:32:22.4282892Z Entering 'third_party/FP16' 2025-12-04T10:32:22.4304237Z Entering 'third_party/FXdiv' 2025-12-04T10:32:22.4325336Z Entering 'third_party/NNPACK' 2025-12-04T10:32:22.4348464Z Entering 'third_party/NVTX' 2025-12-04T10:32:22.4370100Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T10:32:22.4396581Z Entering 'third_party/XNNPACK' 2025-12-04T10:32:22.4426890Z Entering 'third_party/aiter' 2025-12-04T10:32:22.4454882Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T10:32:22.4479907Z Entering 'third_party/benchmark' 2025-12-04T10:32:22.4502218Z Entering 'third_party/composable_kernel' 2025-12-04T10:32:22.4525737Z Entering 'third_party/cpp-httplib' 2025-12-04T10:32:22.4550709Z Entering 'third_party/cpuinfo' 2025-12-04T10:32:22.4574082Z Entering 'third_party/cudnn_frontend' 2025-12-04T10:32:22.4594205Z Entering 'third_party/cutlass' 2025-12-04T10:32:22.4618515Z Entering 'third_party/fbgemm' 2025-12-04T10:32:22.4641342Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T10:32:22.4668524Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T10:32:22.4697699Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T10:32:22.4719775Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T10:32:22.4741757Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T10:32:22.4767562Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T10:32:22.4789484Z Entering 'third_party/fbgemm/external/json' 2025-12-04T10:32:22.4819398Z Entering 'third_party/flash-attention' 2025-12-04T10:32:22.4841397Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T10:32:22.4867478Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T10:32:22.4901416Z Entering 'third_party/flatbuffers' 2025-12-04T10:32:22.4924067Z Entering 'third_party/fmt' 2025-12-04T10:32:22.4945462Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T10:32:22.4965784Z Entering 'third_party/gloo' 2025-12-04T10:32:22.4986790Z Entering 'third_party/googletest' 2025-12-04T10:32:22.5013984Z Entering 'third_party/ideep' 2025-12-04T10:32:22.5039850Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T10:32:22.5066515Z Entering 'third_party/ittapi' 2025-12-04T10:32:22.5088930Z Entering 'third_party/kineto' 2025-12-04T10:32:22.5110671Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T10:32:22.5145307Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 
2025-12-04T10:32:22.5168187Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T10:32:22.5191417Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T10:32:22.5212833Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T10:32:22.5234780Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T10:32:22.5256620Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T10:32:22.5278312Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T10:32:22.5298532Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T10:32:22.5323543Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T10:32:22.5342562Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T10:32:22.5363464Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:22.5384286Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:22.5412037Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T10:32:22.5434395Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T10:32:22.5455500Z Entering 'third_party/kleidiai' 2025-12-04T10:32:22.5479838Z Entering 'third_party/mimalloc' 2025-12-04T10:32:22.5508156Z Entering 'third_party/nlohmann' 2025-12-04T10:32:22.5536596Z Entering 'third_party/onnx' 2025-12-04T10:32:22.5564969Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T10:32:22.5592915Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T10:32:22.5619442Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T10:32:22.5641571Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T10:32:22.5663212Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T10:32:22.5692928Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T10:32:22.5715598Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T10:32:22.5736067Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T10:32:22.5755484Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T10:32:22.5777464Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:22.5799264Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:22.5825749Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T10:32:22.5855508Z Entering 'third_party/pocketfft' 2025-12-04T10:32:22.5877879Z Entering 'third_party/protobuf' 2025-12-04T10:32:22.5899150Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T10:32:22.5922744Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T10:32:22.5943770Z Entering 'third_party/psimd' 2025-12-04T10:32:22.5965143Z Entering 'third_party/pthreadpool' 2025-12-04T10:32:22.5984750Z Entering 'third_party/pybind11' 2025-12-04T10:32:22.6013734Z Entering 'third_party/python-peachpy' 2025-12-04T10:32:22.6047301Z Entering 'third_party/sleef' 2025-12-04T10:32:22.6071038Z Entering 'third_party/tensorpipe' 2025-12-04T10:32:22.6094326Z Entering 'third_party/tensorpipe/third_party/googletest' 
2025-12-04T10:32:22.6132465Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T10:32:22.6155999Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T10:32:22.6181461Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T10:32:22.6207135Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T10:32:22.6249189Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local 'http.https://github.com/.extraheader' 'AUTHORIZATION: basic ***' && git config --local --show-origin --name-only --get-regexp remote.origin.url" 2025-12-04T10:32:22.6418143Z Entering 'android/libs/fbjni' 2025-12-04T10:32:22.6447151Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T10:32:22.6455823Z Entering 'third_party/FP16' 2025-12-04T10:32:22.6475406Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T10:32:22.6485176Z Entering 'third_party/FXdiv' 2025-12-04T10:32:22.6507037Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T10:32:22.6517342Z Entering 'third_party/NNPACK' 2025-12-04T10:32:22.6538240Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T10:32:22.6554270Z Entering 'third_party/NVTX' 2025-12-04T10:32:22.6573143Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T10:32:22.6587110Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T10:32:22.6610725Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T10:32:22.6620464Z Entering 'third_party/XNNPACK' 2025-12-04T10:32:22.6639074Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T10:32:22.6653667Z Entering 'third_party/aiter' 2025-12-04T10:32:22.6673347Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T10:32:22.6682966Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T10:32:22.6715891Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T10:32:22.6731776Z Entering 'third_party/benchmark' 2025-12-04T10:32:22.6759558Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T10:32:22.6769843Z Entering 'third_party/composable_kernel' 2025-12-04T10:32:22.6800872Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T10:32:22.6814264Z Entering 'third_party/cpp-httplib' 2025-12-04T10:32:22.6838933Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T10:32:22.6850101Z Entering 'third_party/cpuinfo' 2025-12-04T10:32:22.6878101Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T10:32:22.6888759Z Entering 'third_party/cudnn_frontend' 2025-12-04T10:32:22.6914950Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-12-04T10:32:22.6929229Z Entering 'third_party/cutlass' 2025-12-04T10:32:22.6968263Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 
2025-12-04T10:32:22.6991746Z Entering 'third_party/fbgemm' 2025-12-04T10:32:22.7025923Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T10:32:22.7046836Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T10:32:22.7068416Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T10:32:22.7084585Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T10:32:22.7105212Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T10:32:22.7118757Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T10:32:22.7143421Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-12-04T10:32:22.7157467Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T10:32:22.7187523Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T10:32:22.7212086Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T10:32:22.7241163Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T10:32:22.7253074Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T10:32:22.7275812Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T10:32:22.7294365Z Entering 'third_party/fbgemm/external/json' 2025-12-04T10:32:22.7320999Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-12-04T10:32:22.7336220Z Entering 'third_party/flash-attention' 2025-12-04T10:32:22.7359799Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T10:32:22.7374440Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T10:32:22.7397858Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-12-04T10:32:22.7415454Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T10:32:22.7449303Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-12-04T10:32:22.7463865Z Entering 'third_party/flatbuffers' 2025-12-04T10:32:22.7483903Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T10:32:22.7496507Z Entering 'third_party/fmt' 2025-12-04T10:32:22.7533575Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-12-04T10:32:22.7545116Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T10:32:22.7567042Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T10:32:22.7575900Z Entering 'third_party/gloo' 2025-12-04T10:32:22.7600224Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-12-04T10:32:22.7610999Z Entering 'third_party/googletest' 2025-12-04T10:32:22.7631531Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-12-04T10:32:22.7640914Z Entering 'third_party/ideep' 2025-12-04T10:32:22.7664811Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T10:32:22.7674984Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T10:32:22.7702241Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T10:32:22.7721719Z Entering 'third_party/ittapi' 2025-12-04T10:32:22.7743800Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-12-04T10:32:22.7755270Z Entering 'third_party/kineto' 2025-12-04T10:32:22.7787506Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T10:32:22.7796983Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T10:32:22.7817848Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T10:32:22.7827110Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T10:32:22.7847585Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T10:32:22.7856920Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T10:32:22.7882819Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T10:32:22.7895476Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T10:32:22.7920268Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T10:32:22.7932391Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T10:32:22.7957416Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T10:32:22.7967802Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T10:32:22.7992398Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T10:32:22.8004578Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T10:32:22.8032140Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-12-04T10:32:22.8041946Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T10:32:22.8073518Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T10:32:22.8084470Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T10:32:22.8113376Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T10:32:22.8126394Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T10:32:22.8153427Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T10:32:22.8169729Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T10:32:22.8199526Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T10:32:22.8211217Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:22.8234080Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T10:32:22.8245031Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:22.8267383Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T10:32:22.8280472Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T10:32:22.8299827Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-12-04T10:32:22.8309186Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T10:32:22.8328169Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-12-04T10:32:22.8339335Z Entering 'third_party/kleidiai' 2025-12-04T10:32:22.8363509Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-12-04T10:32:22.8373448Z Entering 'third_party/mimalloc' 2025-12-04T10:32:22.8397137Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-12-04T10:32:22.8405989Z Entering 'third_party/nlohmann' 2025-12-04T10:32:22.8617845Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-12-04T10:32:22.8628087Z Entering 'third_party/onnx' 2025-12-04T10:32:22.8648873Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-12-04T10:32:22.8674923Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T10:32:22.8704690Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-12-04T10:32:22.8717255Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T10:32:22.8743268Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-12-04T10:32:22.8753158Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T10:32:22.8775577Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-12-04T10:32:22.8784228Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T10:32:22.8802473Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-12-04T10:32:22.8811648Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T10:32:22.8838382Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-12-04T10:32:22.8847948Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T10:32:22.8865417Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-12-04T10:32:22.8873860Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T10:32:22.8898876Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-12-04T10:32:22.8909670Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T10:32:22.8928399Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-12-04T10:32:22.8937464Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T10:32:22.8954920Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T10:32:22.8963818Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:22.8997477Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T10:32:22.9013231Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:22.9036947Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T10:32:22.9048060Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T10:32:22.9068786Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-12-04T10:32:22.9092044Z Entering 'third_party/pocketfft' 2025-12-04T10:32:22.9113918Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-12-04T10:32:22.9123407Z Entering 'third_party/protobuf' 2025-12-04T10:32:22.9141713Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-12-04T10:32:22.9152260Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T10:32:22.9174655Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-12-04T10:32:22.9185053Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T10:32:22.9204191Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-12-04T10:32:22.9215127Z Entering 'third_party/psimd' 2025-12-04T10:32:22.9234155Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-12-04T10:32:22.9243558Z Entering 'third_party/pthreadpool' 2025-12-04T10:32:22.9268082Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-12-04T10:32:22.9278522Z Entering 'third_party/pybind11' 2025-12-04T10:32:22.9299740Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 
2025-12-04T10:32:22.9308446Z Entering 'third_party/python-peachpy' 2025-12-04T10:32:22.9329686Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-12-04T10:32:22.9339950Z Entering 'third_party/sleef' 2025-12-04T10:32:22.9360119Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-12-04T10:32:22.9369707Z Entering 'third_party/tensorpipe' 2025-12-04T10:32:22.9389883Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-12-04T10:32:22.9399127Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T10:32:22.9418261Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-12-04T10:32:22.9429239Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T10:32:22.9446414Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-12-04T10:32:22.9454876Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T10:32:22.9473624Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-12-04T10:32:22.9487375Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T10:32:22.9516599Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-12-04T10:32:22.9527030Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T10:32:22.9550359Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-12-04T10:32:22.9985720Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:' 2025-12-04T10:32:23.0189735Z Entering 'android/libs/fbjni' 2025-12-04T10:32:23.0213989Z Entering 'third_party/FP16' 2025-12-04T10:32:23.0236272Z Entering 'third_party/FXdiv' 2025-12-04T10:32:23.0262997Z Entering 'third_party/NNPACK' 2025-12-04T10:32:23.0288295Z Entering 'third_party/NVTX' 2025-12-04T10:32:23.0314300Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T10:32:23.0340743Z Entering 'third_party/XNNPACK' 2025-12-04T10:32:23.0371654Z Entering 'third_party/aiter' 2025-12-04T10:32:23.0403291Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T10:32:23.0434271Z Entering 'third_party/benchmark' 2025-12-04T10:32:23.0460064Z Entering 'third_party/composable_kernel' 2025-12-04T10:32:23.0495390Z Entering 'third_party/cpp-httplib' 2025-12-04T10:32:23.0518199Z Entering 'third_party/cpuinfo' 2025-12-04T10:32:23.0543973Z Entering 'third_party/cudnn_frontend' 2025-12-04T10:32:23.0566349Z Entering 'third_party/cutlass' 2025-12-04T10:32:23.0597845Z Entering 'third_party/fbgemm' 2025-12-04T10:32:23.0620800Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T10:32:23.0640761Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T10:32:23.0673481Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T10:32:23.0704215Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T10:32:23.0731920Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T10:32:23.0754488Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T10:32:23.0782593Z Entering 'third_party/fbgemm/external/json' 
2025-12-04T10:32:23.0812007Z Entering 'third_party/flash-attention' 2025-12-04T10:32:23.0834459Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T10:32:23.0865852Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T10:32:23.0896506Z Entering 'third_party/flatbuffers' 2025-12-04T10:32:23.0918853Z Entering 'third_party/fmt' 2025-12-04T10:32:23.0944378Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T10:32:23.0965978Z Entering 'third_party/gloo' 2025-12-04T10:32:23.0989418Z Entering 'third_party/googletest' 2025-12-04T10:32:23.1013810Z Entering 'third_party/ideep' 2025-12-04T10:32:23.1041727Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T10:32:23.1067654Z Entering 'third_party/ittapi' 2025-12-04T10:32:23.1088854Z Entering 'third_party/kineto' 2025-12-04T10:32:23.1113114Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T10:32:23.1143532Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T10:32:23.1169886Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T10:32:23.1196704Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T10:32:23.1218327Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T10:32:23.1238335Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T10:32:23.1263558Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T10:32:23.1291737Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T10:32:23.1319306Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T10:32:23.1340168Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T10:32:23.1360979Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T10:32:23.1385111Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:23.1410437Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:23.1437997Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T10:32:23.1458593Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T10:32:23.1483152Z Entering 'third_party/kleidiai' 2025-12-04T10:32:23.1513047Z Entering 'third_party/mimalloc' 2025-12-04T10:32:23.1542105Z Entering 'third_party/nlohmann' 2025-12-04T10:32:23.1570334Z Entering 'third_party/onnx' 2025-12-04T10:32:23.1617369Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T10:32:23.1643243Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T10:32:23.1675471Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T10:32:23.1698741Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T10:32:23.1725600Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T10:32:23.1750244Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T10:32:23.1770055Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T10:32:23.1793047Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T10:32:23.1818535Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T10:32:23.1839804Z Entering 
'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:23.1867402Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:23.1890478Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T10:32:23.1933177Z Entering 'third_party/pocketfft' 2025-12-04T10:32:23.1954559Z Entering 'third_party/protobuf' 2025-12-04T10:32:23.1985268Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T10:32:23.2016334Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T10:32:23.2041547Z Entering 'third_party/psimd' 2025-12-04T10:32:23.2062818Z Entering 'third_party/pthreadpool' 2025-12-04T10:32:23.2090922Z Entering 'third_party/pybind11' 2025-12-04T10:32:23.2119880Z Entering 'third_party/python-peachpy' 2025-12-04T10:32:23.2144942Z Entering 'third_party/sleef' 2025-12-04T10:32:23.2170979Z Entering 'third_party/tensorpipe' 2025-12-04T10:32:23.2193533Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T10:32:23.2212354Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T10:32:23.2235003Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T10:32:23.2261476Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T10:32:23.2283521Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T10:32:23.2320748Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'org-21003710@github.com:' 2025-12-04T10:32:23.2504429Z Entering 'android/libs/fbjni' 2025-12-04T10:32:23.2528263Z Entering 'third_party/FP16' 2025-12-04T10:32:23.2547482Z Entering 'third_party/FXdiv' 2025-12-04T10:32:23.2568928Z Entering 'third_party/NNPACK' 2025-12-04T10:32:23.2587577Z Entering 'third_party/NVTX' 2025-12-04T10:32:23.2606781Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T10:32:23.2628608Z Entering 'third_party/XNNPACK' 2025-12-04T10:32:23.2662874Z Entering 'third_party/aiter' 2025-12-04T10:32:23.2685927Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T10:32:23.2715944Z Entering 'third_party/benchmark' 2025-12-04T10:32:23.2734341Z Entering 'third_party/composable_kernel' 2025-12-04T10:32:23.2757115Z Entering 'third_party/cpp-httplib' 2025-12-04T10:32:23.2776753Z Entering 'third_party/cpuinfo' 2025-12-04T10:32:23.2796534Z Entering 'third_party/cudnn_frontend' 2025-12-04T10:32:23.2817121Z Entering 'third_party/cutlass' 2025-12-04T10:32:23.2848167Z Entering 'third_party/fbgemm' 2025-12-04T10:32:23.2882557Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T10:32:23.2912781Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T10:32:23.2947971Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T10:32:23.2984854Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T10:32:23.3008228Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T10:32:23.3033800Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T10:32:23.3054960Z Entering 'third_party/fbgemm/external/json' 2025-12-04T10:32:23.3085084Z Entering 'third_party/flash-attention' 2025-12-04T10:32:23.3113149Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T10:32:23.3143049Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T10:32:23.3168759Z Entering 'third_party/flatbuffers' 2025-12-04T10:32:23.3201215Z Entering 'third_party/fmt' 2025-12-04T10:32:23.3225069Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T10:32:23.3245824Z 
Entering 'third_party/gloo' 2025-12-04T10:32:23.3270968Z Entering 'third_party/googletest' 2025-12-04T10:32:23.3293118Z Entering 'third_party/ideep' 2025-12-04T10:32:23.3315958Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T10:32:23.3339652Z Entering 'third_party/ittapi' 2025-12-04T10:32:23.3365386Z Entering 'third_party/kineto' 2025-12-04T10:32:23.3384837Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T10:32:23.3406908Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T10:32:23.3427151Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T10:32:23.3445056Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T10:32:23.3470467Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T10:32:23.3490905Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T10:32:23.3514171Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T10:32:23.3533075Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T10:32:23.3550304Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T10:32:23.3572690Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T10:32:23.3596714Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T10:32:23.3616381Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:23.3638896Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:23.3661819Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T10:32:23.3680526Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T10:32:23.3703806Z Entering 'third_party/kleidiai' 2025-12-04T10:32:23.3723568Z Entering 'third_party/mimalloc' 2025-12-04T10:32:23.3745112Z Entering 'third_party/nlohmann' 2025-12-04T10:32:23.3766857Z Entering 'third_party/onnx' 2025-12-04T10:32:23.3791585Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T10:32:23.3814327Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T10:32:23.3835846Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T10:32:23.3856401Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T10:32:23.3877568Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T10:32:23.3896419Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T10:32:23.3922425Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T10:32:23.3947248Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T10:32:23.3965928Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T10:32:23.3984724Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:23.4023460Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:23.4047417Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T10:32:23.4075863Z Entering 'third_party/pocketfft' 2025-12-04T10:32:23.4098662Z Entering 'third_party/protobuf' 2025-12-04T10:32:23.4119351Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T10:32:23.4137974Z 
Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T10:32:23.4157946Z Entering 'third_party/psimd' 2025-12-04T10:32:23.4177422Z Entering 'third_party/pthreadpool' 2025-12-04T10:32:23.4197002Z Entering 'third_party/pybind11' 2025-12-04T10:32:23.4216515Z Entering 'third_party/python-peachpy' 2025-12-04T10:32:23.4240712Z Entering 'third_party/sleef' 2025-12-04T10:32:23.4259698Z Entering 'third_party/tensorpipe' 2025-12-04T10:32:23.4278697Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T10:32:23.4308355Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T10:32:23.4327620Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T10:32:23.4347924Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T10:32:23.4366108Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T10:32:23.4396362Z ##[endgroup] 2025-12-04T10:32:23.4557690Z [command]/usr/bin/git log -1 --format=%H 2025-12-04T10:32:23.4646281Z ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T10:32:23.4763979Z ##[group]Run actions/checkout@v4 2025-12-04T10:32:23.4764117Z with: 2025-12-04T10:32:23.4764226Z ref: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T10:32:23.4764371Z fetch-depth: 0 2025-12-04T10:32:23.4764477Z submodules: recursive 2025-12-04T10:32:23.4764580Z show-progress: false 2025-12-04T10:32:23.4764694Z repository: pytorch/pytorch 2025-12-04T10:32:23.4764847Z token: *** 2025-12-04T10:32:23.4764936Z ssh-strict: true 2025-12-04T10:32:23.4765032Z ssh-user: git 2025-12-04T10:32:23.4765128Z persist-credentials: true 2025-12-04T10:32:23.4765237Z clean: true 2025-12-04T10:32:23.4765360Z sparse-checkout-cone-mode: true 2025-12-04T10:32:23.4765475Z fetch-tags: false 2025-12-04T10:32:23.4765568Z lfs: false 2025-12-04T10:32:23.4765657Z set-safe-directory: true 2025-12-04T10:32:23.4765761Z env: 2025-12-04T10:32:23.4765856Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:32:23.4765954Z ##[endgroup] 2025-12-04T10:32:23.5216619Z Syncing repository: pytorch/pytorch 2025-12-04T10:32:23.5216918Z ##[group]Getting Git version info 2025-12-04T10:32:23.5217087Z Working directory is '/home/runner/_work/pytorch/pytorch' 2025-12-04T10:32:23.5229513Z [command]/usr/bin/git version 2025-12-04T10:32:23.5249687Z git version 2.52.0 2025-12-04T10:32:23.5260356Z ##[endgroup] 2025-12-04T10:32:23.5264272Z Copying '/home/runner/.gitconfig' to '/home/runner/_work/_temp/08e0f1a2-b75d-4e90-b750-dafa61b62478/.gitconfig' 2025-12-04T10:32:23.5269222Z Temporarily overriding HOME='/home/runner/_work/_temp/08e0f1a2-b75d-4e90-b750-dafa61b62478' before making global git config changes 2025-12-04T10:32:23.5269615Z Adding repository directory to the temporary git global config as a safe directory 2025-12-04T10:32:23.5276707Z [command]/usr/bin/git config --global --add safe.directory /home/runner/_work/pytorch/pytorch 2025-12-04T10:32:23.5296960Z [command]/usr/bin/git config --local --get remote.origin.url 2025-12-04T10:32:23.5314186Z https://github.com/pytorch/pytorch 2025-12-04T10:32:23.5321024Z ##[group]Removing previously created refs, to avoid conflicts 2025-12-04T10:32:23.5322886Z [command]/usr/bin/git rev-parse --symbolic-full-name --verify --quiet HEAD 2025-12-04T10:32:23.5337062Z HEAD 2025-12-04T10:32:23.5363888Z ##[endgroup] 2025-12-04T10:32:23.5365688Z [command]/usr/bin/git submodule status 2025-12-04T10:32:23.5563850Z 7e1e1fe3858c63c251c637ae41a20de425dde96f android/libs/fbjni (v0.1.0-12-g7e1e1fe) 2025-12-04T10:32:23.5610903Z 4dfe081cf6bcd15db339cf2680b9281b8451eeb3 
third_party/FP16 (4dfe081) 2025-12-04T10:32:23.5652909Z b408327ac2a15ec3e43352421954f5b1967701d1 third_party/FXdiv (b408327) 2025-12-04T10:32:23.5711024Z c07e3a0400713d546e0dea2d5466dd22ea389c73 third_party/NNPACK (c07e3a0) 2025-12-04T10:32:23.5746584Z 3ebbc93ded7285963bff932c678fa367eb393ba6 third_party/NVTX (v3.1.0-313-g3ebbc93) 2025-12-04T10:32:23.5798006Z 1d8f600fd424278486eade7ed3e877c99f0846b1 third_party/VulkanMemoryAllocator (v2.1.0-982-g1d8f600) 2025-12-04T10:32:23.6083939Z 51a0103656eff6fc9bfd39a4597923c4b542c883 third_party/XNNPACK (remotes/origin/ds/ndk-1243-g51a0103656) 2025-12-04T10:32:23.6129251Z 01aae101b9e5e94d6c16a9514c9fb8df99c93150 third_party/aiter (v0.1.1-92-g01aae101) 2025-12-04T10:32:23.6146682Z 299e5928955cc62af9968370293b916f5130916f third_party/benchmark (v1.9.3) 2025-12-04T10:32:23.6210382Z 7fe50dc3da2069d6645d9deb8c017a876472a977 third_party/composable_kernel (rocm-6.4.3-459-g7fe50dc3d) 2025-12-04T10:32:23.6301056Z 89c932f313c6437c38f2982869beacc89c2f2246 third_party/cpp-httplib (v0.26.0) 2025-12-04T10:32:23.6376575Z f858c30bcb16f8effd5ff46996f0514539e17abc third_party/cpuinfo (f858c30) 2025-12-04T10:32:23.6400104Z 0b1577c8c83401237d601d0d0db5210506705396 third_party/cudnn_frontend (v0.5-61-g0b1577c) 2025-12-04T10:32:23.6462649Z f88806b1e31dfa579842638740216dd41fc6c588 third_party/cutlass (v4.3.1) 2025-12-04T10:32:23.6495860Z c0b988d39a9e47c794d699f29930ed4d7c7e13a4 third_party/fbgemm (v1.4.0-rc1-2-gc0b988d39) 2025-12-04T10:32:23.6546891Z 979702c87a8713a8e0a5e9fee122b90d2ef13be5 third_party/flash-attention (v2.7.4) 2025-12-04T10:32:23.6560773Z a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757 third_party/flatbuffers (v24.12.23) 2025-12-04T10:32:23.6800702Z 407c905e45ad75fc29bf0f9bb7c5c2fd3475976f third_party/fmt (12.1.0) 2025-12-04T10:32:23.6873595Z 3fb5c176c17c765a3492cd2f0321b0dab712f350 third_party/gemmlowp/gemmlowp (remotes/origin/revert-87-master-135-g3fb5c17) 2025-12-04T10:32:23.6956279Z 54cbae0d3a67fa890b4c3d9ee162b7860315e341 third_party/gloo (remotes/origin/gh/c-p-i-o/1/base-37-g54cbae0) 2025-12-04T10:32:23.7099987Z 52eb8108c5bdec04579160ae17225d66034bd723 third_party/googletest (release-1.8.0-3544-g52eb8108) 2025-12-04T10:32:23.7148580Z 719d8e6cd7f7a0e01b155657526d693acf97c2b3 third_party/ideep (pytorch-rls-v3.7.1) 2025-12-04T10:32:23.7197018Z dec1d23ca65ab069d225dfe40dea14f455170959 third_party/ittapi (v3.25.5) 2025-12-04T10:32:23.7323014Z 31f85df8fbd89c188f14ef10f1ec65379786b943 third_party/kineto (heads/main) 2025-12-04T10:32:23.7348940Z d7770c89632329a9914ef1a90289917597639cbe third_party/kleidiai (v1.15.0) 2025-12-04T10:32:23.7370821Z fbd8b99c2b828428947d70fdc046bb55609be93e third_party/mimalloc (v2.2.4) 2025-12-04T10:32:23.7393858Z 55f93686c01528224f448c19128836e7df245f72 third_party/nlohmann (v3.12.0) 2025-12-04T10:32:23.7600719Z e709452ef2bbc1d113faf678c24e6d3467696e83 third_party/onnx (v1.18.0) 2025-12-04T10:32:23.7625023Z a799f4aed9c94b765dcdaabaeab7d5e7e2310878 third_party/opentelemetry-cpp (v1.14.2) 2025-12-04T10:32:23.7647869Z 0fa0ef591e38c2758e3184c6c23e497b9f732ffa third_party/pocketfft (release_for_eigen-40-g0fa0ef5) 2025-12-04T10:32:23.7864276Z d1eca4e4b421cd2997495c4b4e65cea6be4e9b8a third_party/protobuf (v3.7.0-rc.2-1279-gd1eca4e4b) 2025-12-04T10:32:23.7904783Z 072586a71b55b7f8c584153d223e95687148a900 third_party/psimd (heads/master) 2025-12-04T10:32:23.7940554Z 4fe0e1e183925bf8cfa6aae24237e724a96479b8 third_party/pthreadpool (0.1-144-g4fe0e1e) 2025-12-04T10:32:23.7964620Z f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8 third_party/pybind11 
(v3.0.1) 2025-12-04T10:32:23.8027602Z f45429b087dd7d5bc78bb40dc7cf06425c252d67 third_party/python-peachpy (remotes/origin/pre-generated) 2025-12-04T10:32:23.8076100Z 5a1d179df9cf652951b59010a2d2075372d67f68 third_party/sleef (3.8) 2025-12-04T10:32:23.8129202Z 2b4cd91092d335a697416b2a3cb398283246849d third_party/tensorpipe (heads/main) 2025-12-04T10:32:23.8139635Z ##[group]Cleaning the repository 2025-12-04T10:32:23.8143808Z [command]/usr/bin/git clean -ffdx 2025-12-04T10:32:23.8259071Z [command]/usr/bin/git reset --hard HEAD 2025-12-04T10:32:23.9075315Z HEAD is now at ffd9b0fb4355 Resolve collective autotuning test failure on arm (#168919) 2025-12-04T10:32:23.9139844Z ##[endgroup] 2025-12-04T10:32:23.9142347Z ##[group]Disabling automatic garbage collection 2025-12-04T10:32:23.9147226Z [command]/usr/bin/git config --local gc.auto 0 2025-12-04T10:32:23.9173619Z ##[endgroup] 2025-12-04T10:32:23.9173982Z ##[group]Setting up auth 2025-12-04T10:32:23.9176342Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-12-04T10:32:23.9196035Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-12-04T10:32:23.9377333Z Entering 'android/libs/fbjni' 2025-12-04T10:32:23.9406117Z Entering 'third_party/FP16' 2025-12-04T10:32:23.9429190Z Entering 'third_party/FXdiv' 2025-12-04T10:32:23.9457389Z Entering 'third_party/NNPACK' 2025-12-04T10:32:23.9484326Z Entering 'third_party/NVTX' 2025-12-04T10:32:23.9541382Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T10:32:23.9565292Z Entering 'third_party/XNNPACK' 2025-12-04T10:32:23.9594221Z Entering 'third_party/aiter' 2025-12-04T10:32:23.9618399Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T10:32:23.9644246Z Entering 'third_party/benchmark' 2025-12-04T10:32:23.9668079Z Entering 'third_party/composable_kernel' 2025-12-04T10:32:23.9693621Z Entering 'third_party/cpp-httplib' 2025-12-04T10:32:23.9717921Z Entering 'third_party/cpuinfo' 2025-12-04T10:32:23.9740982Z Entering 'third_party/cudnn_frontend' 2025-12-04T10:32:23.9761579Z Entering 'third_party/cutlass' 2025-12-04T10:32:23.9786466Z Entering 'third_party/fbgemm' 2025-12-04T10:32:23.9810213Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T10:32:23.9836706Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T10:32:23.9860286Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T10:32:23.9900596Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T10:32:23.9928025Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T10:32:23.9954665Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T10:32:23.9976947Z Entering 'third_party/fbgemm/external/json' 2025-12-04T10:32:24.0003073Z Entering 'third_party/flash-attention' 2025-12-04T10:32:24.0024302Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T10:32:24.0048717Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T10:32:24.0075798Z Entering 'third_party/flatbuffers' 2025-12-04T10:32:24.0096105Z Entering 'third_party/fmt' 2025-12-04T10:32:24.0126149Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T10:32:24.0155454Z Entering 'third_party/gloo' 2025-12-04T10:32:24.0181673Z Entering 'third_party/googletest' 2025-12-04T10:32:24.0202356Z Entering 'third_party/ideep' 2025-12-04T10:32:24.0222780Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T10:32:24.0247546Z Entering 'third_party/ittapi' 
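The "Setting up auth" step above first strips any stale core.sshCommand override from the superproject and from every nested submodule before fresh credentials are installed; the 'Entering ...' lines that follow are the per-submodule output of that sweep. A minimal manual equivalent of the cleanup, run from the repository root and tolerant of repos that have no such setting, would be roughly:

  # remove a per-repo SSH command override, if one exists
  git config --local --name-only --get-regexp 'core\.sshCommand' \
    && git config --local --unset-all 'core.sshCommand' || :
  # repeat the same cleanup inside every submodule, recursively
  git submodule foreach --recursive sh -c \
    "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :"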
2025-12-04T10:32:24.0268723Z Entering 'third_party/kineto' 2025-12-04T10:32:24.0293345Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T10:32:24.0322755Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T10:32:24.0353166Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T10:32:24.0374547Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T10:32:24.0394467Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T10:32:24.0415357Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T10:32:24.0441594Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T10:32:24.0480392Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T10:32:24.0516770Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T10:32:24.0544107Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T10:32:24.0566849Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T10:32:24.0588876Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:24.0613301Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:24.0650640Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T10:32:24.0672088Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T10:32:24.0694457Z Entering 'third_party/kleidiai' 2025-12-04T10:32:24.0717010Z Entering 'third_party/mimalloc' 2025-12-04T10:32:24.0739857Z Entering 'third_party/nlohmann' 2025-12-04T10:32:24.0762332Z Entering 'third_party/onnx' 2025-12-04T10:32:24.0787922Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T10:32:24.0815878Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T10:32:24.0836502Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T10:32:24.0865583Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T10:32:24.0887898Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T10:32:24.0913730Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T10:32:24.0936601Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T10:32:24.0956493Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T10:32:24.0977156Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T10:32:24.0997802Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:24.1020568Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:24.1048769Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T10:32:24.1079319Z Entering 'third_party/pocketfft' 2025-12-04T10:32:24.1102060Z Entering 'third_party/protobuf' 2025-12-04T10:32:24.1131403Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T10:32:24.1155171Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T10:32:24.1183725Z Entering 'third_party/psimd' 2025-12-04T10:32:24.1205884Z Entering 'third_party/pthreadpool' 2025-12-04T10:32:24.1228378Z Entering 'third_party/pybind11' 2025-12-04T10:32:24.1250324Z 
Entering 'third_party/python-peachpy' 2025-12-04T10:32:24.1270443Z Entering 'third_party/sleef' 2025-12-04T10:32:24.1294194Z Entering 'third_party/tensorpipe' 2025-12-04T10:32:24.1320133Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T10:32:24.1343862Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T10:32:24.1363838Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T10:32:24.1382609Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T10:32:24.1403186Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T10:32:24.1444357Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-12-04T10:32:24.1459771Z http.https://github.com/.extraheader 2025-12-04T10:32:24.1467172Z [command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader 2025-12-04T10:32:24.1489407Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-12-04T10:32:24.1667436Z Entering 'android/libs/fbjni' 2025-12-04T10:32:24.1678557Z http.https://github.com/.extraheader 2025-12-04T10:32:24.1699942Z Entering 'third_party/FP16' 2025-12-04T10:32:24.1713412Z http.https://github.com/.extraheader 2025-12-04T10:32:24.1738183Z Entering 'third_party/FXdiv' 2025-12-04T10:32:24.1751822Z http.https://github.com/.extraheader 2025-12-04T10:32:24.1769030Z Entering 'third_party/NNPACK' 2025-12-04T10:32:24.1782117Z http.https://github.com/.extraheader 2025-12-04T10:32:24.1798841Z Entering 'third_party/NVTX' 2025-12-04T10:32:24.1811609Z http.https://github.com/.extraheader 2025-12-04T10:32:24.1831327Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T10:32:24.1843845Z http.https://github.com/.extraheader 2025-12-04T10:32:24.1861552Z Entering 'third_party/XNNPACK' 2025-12-04T10:32:24.1872997Z http.https://github.com/.extraheader 2025-12-04T10:32:24.1895643Z Entering 'third_party/aiter' 2025-12-04T10:32:24.1907928Z http.https://github.com/.extraheader 2025-12-04T10:32:24.1925334Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T10:32:24.1936831Z http.https://github.com/.extraheader 2025-12-04T10:32:24.1963984Z Entering 'third_party/benchmark' 2025-12-04T10:32:24.1976490Z http.https://github.com/.extraheader 2025-12-04T10:32:24.1995705Z Entering 'third_party/composable_kernel' 2025-12-04T10:32:24.2007823Z http.https://github.com/.extraheader 2025-12-04T10:32:24.2029965Z Entering 'third_party/cpp-httplib' 2025-12-04T10:32:24.2047271Z http.https://github.com/.extraheader 2025-12-04T10:32:24.2073479Z Entering 'third_party/cpuinfo' 2025-12-04T10:32:24.2089704Z http.https://github.com/.extraheader 2025-12-04T10:32:24.2111637Z Entering 'third_party/cudnn_frontend' 2025-12-04T10:32:24.2124746Z http.https://github.com/.extraheader 2025-12-04T10:32:24.2148691Z Entering 'third_party/cutlass' 2025-12-04T10:32:24.2161684Z http.https://github.com/.extraheader 2025-12-04T10:32:24.2181003Z Entering 'third_party/fbgemm' 2025-12-04T10:32:24.2194551Z http.https://github.com/.extraheader 2025-12-04T10:32:24.2211999Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T10:32:24.2224747Z http.https://github.com/.extraheader 2025-12-04T10:32:24.2248257Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T10:32:24.2264803Z http.https://github.com/.extraheader 2025-12-04T10:32:24.2289052Z 
Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T10:32:24.2307803Z http.https://github.com/.extraheader 2025-12-04T10:32:24.2335380Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T10:32:24.2361605Z http.https://github.com/.extraheader 2025-12-04T10:32:24.2389910Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T10:32:24.2408468Z http.https://github.com/.extraheader 2025-12-04T10:32:24.2428614Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T10:32:24.2443477Z http.https://github.com/.extraheader 2025-12-04T10:32:24.2465878Z Entering 'third_party/fbgemm/external/json' 2025-12-04T10:32:24.2478633Z http.https://github.com/.extraheader 2025-12-04T10:32:24.2498186Z Entering 'third_party/flash-attention' 2025-12-04T10:32:24.2509846Z http.https://github.com/.extraheader 2025-12-04T10:32:24.2526873Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T10:32:24.2543344Z http.https://github.com/.extraheader 2025-12-04T10:32:24.2565720Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T10:32:24.2578924Z http.https://github.com/.extraheader 2025-12-04T10:32:24.2606046Z Entering 'third_party/flatbuffers' 2025-12-04T10:32:24.2621185Z http.https://github.com/.extraheader 2025-12-04T10:32:24.2640715Z Entering 'third_party/fmt' 2025-12-04T10:32:24.2652379Z http.https://github.com/.extraheader 2025-12-04T10:32:24.2668111Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T10:32:24.2680572Z http.https://github.com/.extraheader 2025-12-04T10:32:24.2696150Z Entering 'third_party/gloo' 2025-12-04T10:32:24.2710089Z http.https://github.com/.extraheader 2025-12-04T10:32:24.2726535Z Entering 'third_party/googletest' 2025-12-04T10:32:24.2739410Z http.https://github.com/.extraheader 2025-12-04T10:32:24.2754668Z Entering 'third_party/ideep' 2025-12-04T10:32:24.2767650Z http.https://github.com/.extraheader 2025-12-04T10:32:24.2785918Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T10:32:24.2799007Z http.https://github.com/.extraheader 2025-12-04T10:32:24.2820387Z Entering 'third_party/ittapi' 2025-12-04T10:32:24.2836988Z http.https://github.com/.extraheader 2025-12-04T10:32:24.2853813Z Entering 'third_party/kineto' 2025-12-04T10:32:24.2867581Z http.https://github.com/.extraheader 2025-12-04T10:32:24.2884354Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T10:32:24.2897464Z http.https://github.com/.extraheader 2025-12-04T10:32:24.2914492Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T10:32:24.2929449Z http.https://github.com/.extraheader 2025-12-04T10:32:24.2945482Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T10:32:24.2956812Z http.https://github.com/.extraheader 2025-12-04T10:32:24.2973145Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T10:32:24.2984362Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3003950Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T10:32:24.3014719Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3030536Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T10:32:24.3046499Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3068310Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T10:32:24.3083142Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3099708Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T10:32:24.3111363Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3127774Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T10:32:24.3138767Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3156006Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T10:32:24.3171451Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3188578Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T10:32:24.3199665Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3214785Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:24.3228367Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3246114Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:24.3258905Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3277920Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T10:32:24.3289609Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3304961Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T10:32:24.3317546Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3341004Z Entering 'third_party/kleidiai' 2025-12-04T10:32:24.3353905Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3370788Z Entering 'third_party/mimalloc' 2025-12-04T10:32:24.3385092Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3409851Z Entering 'third_party/nlohmann' 2025-12-04T10:32:24.3423677Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3443514Z Entering 'third_party/onnx' 2025-12-04T10:32:24.3459541Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3482563Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T10:32:24.3509843Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3532975Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T10:32:24.3546366Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3563436Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T10:32:24.3577720Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3594327Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T10:32:24.3607463Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3623034Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T10:32:24.3643724Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3658245Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T10:32:24.3671576Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3689785Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T10:32:24.3702621Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3726497Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T10:32:24.3738900Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3754061Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T10:32:24.3765857Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3784066Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:24.3796503Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3815374Z Entering 
'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:24.3830664Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3847740Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T10:32:24.3859637Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3888387Z Entering 'third_party/pocketfft' 2025-12-04T10:32:24.3900529Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3917452Z Entering 'third_party/protobuf' 2025-12-04T10:32:24.3933015Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3951183Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T10:32:24.3963601Z http.https://github.com/.extraheader 2025-12-04T10:32:24.3984431Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T10:32:24.3997438Z http.https://github.com/.extraheader 2025-12-04T10:32:24.4018310Z Entering 'third_party/psimd' 2025-12-04T10:32:24.4031457Z http.https://github.com/.extraheader 2025-12-04T10:32:24.4052545Z Entering 'third_party/pthreadpool' 2025-12-04T10:32:24.4065756Z http.https://github.com/.extraheader 2025-12-04T10:32:24.4083801Z Entering 'third_party/pybind11' 2025-12-04T10:32:24.4096701Z http.https://github.com/.extraheader 2025-12-04T10:32:24.4112400Z Entering 'third_party/python-peachpy' 2025-12-04T10:32:24.4125614Z http.https://github.com/.extraheader 2025-12-04T10:32:24.4142820Z Entering 'third_party/sleef' 2025-12-04T10:32:24.4157462Z http.https://github.com/.extraheader 2025-12-04T10:32:24.4178962Z Entering 'third_party/tensorpipe' 2025-12-04T10:32:24.4193235Z http.https://github.com/.extraheader 2025-12-04T10:32:24.4208816Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T10:32:24.4224954Z http.https://github.com/.extraheader 2025-12-04T10:32:24.4240943Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T10:32:24.4259116Z http.https://github.com/.extraheader 2025-12-04T10:32:24.4279533Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T10:32:24.4293285Z http.https://github.com/.extraheader 2025-12-04T10:32:24.4310290Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T10:32:24.4323812Z http.https://github.com/.extraheader 2025-12-04T10:32:24.4340875Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T10:32:24.4352682Z http.https://github.com/.extraheader 2025-12-04T10:32:24.4390996Z [command]/usr/bin/git config --local --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.4408727Z [command]/usr/bin/git submodule foreach --recursive git config --local --show-origin --name-only --get-regexp remote.origin.url 2025-12-04T10:32:24.4558243Z Entering 'android/libs/fbjni' 2025-12-04T10:32:24.4567980Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T10:32:24.4576996Z Entering 'third_party/FP16' 2025-12-04T10:32:24.4587169Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T10:32:24.4595711Z Entering 'third_party/FXdiv' 2025-12-04T10:32:24.4608415Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T10:32:24.4617136Z Entering 'third_party/NNPACK' 2025-12-04T10:32:24.4630478Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T10:32:24.4640459Z Entering 'third_party/NVTX' 2025-12-04T10:32:24.4649747Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T10:32:24.4658348Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T10:32:24.4668764Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T10:32:24.4677397Z Entering 'third_party/XNNPACK' 2025-12-04T10:32:24.4687540Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T10:32:24.4701603Z Entering 'third_party/aiter' 2025-12-04T10:32:24.4712114Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T10:32:24.4726152Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T10:32:24.4735956Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T10:32:24.4748739Z Entering 'third_party/benchmark' 2025-12-04T10:32:24.4759321Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T10:32:24.4771997Z Entering 'third_party/composable_kernel' 2025-12-04T10:32:24.4781959Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T10:32:24.4793702Z Entering 'third_party/cpp-httplib' 2025-12-04T10:32:24.4803979Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T10:32:24.4812370Z Entering 'third_party/cpuinfo' 2025-12-04T10:32:24.4822196Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T10:32:24.4831237Z Entering 'third_party/cudnn_frontend' 2025-12-04T10:32:24.4842312Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-12-04T10:32:24.4850854Z Entering 'third_party/cutlass' 2025-12-04T10:32:24.4860330Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-12-04T10:32:24.4872493Z Entering 'third_party/fbgemm' 2025-12-04T10:32:24.4882544Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T10:32:24.4892565Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T10:32:24.4902401Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T10:32:24.4910349Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T10:32:24.4919775Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T10:32:24.4931551Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T10:32:24.4940682Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-12-04T10:32:24.4953404Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T10:32:24.4964229Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T10:32:24.4982409Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T10:32:24.5001926Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T10:32:24.5011650Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T10:32:24.5021194Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T10:32:24.5029692Z Entering 'third_party/fbgemm/external/json' 2025-12-04T10:32:24.5039252Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-12-04T10:32:24.5050278Z Entering 'third_party/flash-attention' 2025-12-04T10:32:24.5061077Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T10:32:24.5070483Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T10:32:24.5082641Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-12-04T10:32:24.5095116Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T10:32:24.5103796Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-12-04T10:32:24.5116398Z Entering 'third_party/flatbuffers' 2025-12-04T10:32:24.5126766Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T10:32:24.5136658Z Entering 'third_party/fmt' 2025-12-04T10:32:24.5147411Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-12-04T10:32:24.5168469Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T10:32:24.5179854Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T10:32:24.5187866Z Entering 'third_party/gloo' 2025-12-04T10:32:24.5198504Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-12-04T10:32:24.5208927Z Entering 'third_party/googletest' 2025-12-04T10:32:24.5219966Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-12-04T10:32:24.5243326Z Entering 'third_party/ideep' 2025-12-04T10:32:24.5243644Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T10:32:24.5251917Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T10:32:24.5262535Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T10:32:24.5280127Z Entering 'third_party/ittapi' 2025-12-04T10:32:24.5290760Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-12-04T10:32:24.5299956Z Entering 'third_party/kineto' 2025-12-04T10:32:24.5310870Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T10:32:24.5323146Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T10:32:24.5335830Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T10:32:24.5344050Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T10:32:24.5353106Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T10:32:24.5363110Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T10:32:24.5377907Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T10:32:24.5388848Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T10:32:24.5407979Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T10:32:24.5418338Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T10:32:24.5429051Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T10:32:24.5437188Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T10:32:24.5452377Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T10:32:24.5466073Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T10:32:24.5474982Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-12-04T10:32:24.5483823Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T10:32:24.5492724Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T10:32:24.5501576Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T10:32:24.5510910Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T10:32:24.5519962Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T10:32:24.5531381Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T10:32:24.5538548Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T10:32:24.5547241Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T10:32:24.5555539Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:24.5564194Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T10:32:24.5574625Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:24.5584130Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T10:32:24.5596276Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T10:32:24.5605104Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-12-04T10:32:24.5613554Z 
Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T10:32:24.5624181Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-12-04T10:32:24.5636243Z Entering 'third_party/kleidiai' 2025-12-04T10:32:24.5646871Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-12-04T10:32:24.5658831Z Entering 'third_party/mimalloc' 2025-12-04T10:32:24.5668626Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-12-04T10:32:24.5677837Z Entering 'third_party/nlohmann' 2025-12-04T10:32:24.5693741Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-12-04T10:32:24.5702963Z Entering 'third_party/onnx' 2025-12-04T10:32:24.5712069Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-12-04T10:32:24.5726402Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T10:32:24.5735938Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-12-04T10:32:24.5747392Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T10:32:24.5763062Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-12-04T10:32:24.5770491Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T10:32:24.5780864Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-12-04T10:32:24.5789380Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T10:32:24.5799022Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-12-04T10:32:24.5807415Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T10:32:24.5815902Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-12-04T10:32:24.5823569Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T10:32:24.5835689Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-12-04T10:32:24.5845498Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T10:32:24.5855238Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-12-04T10:32:24.5865245Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T10:32:24.5874888Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-12-04T10:32:24.5883661Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T10:32:24.5894239Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T10:32:24.5902831Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:24.5912837Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T10:32:24.5921885Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:24.5930831Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T10:32:24.5941721Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T10:32:24.5950540Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-12-04T10:32:24.5969348Z Entering 'third_party/pocketfft' 2025-12-04T10:32:24.5979465Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-12-04T10:32:24.5991165Z Entering 'third_party/protobuf' 2025-12-04T10:32:24.6001391Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-12-04T10:32:24.6011592Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T10:32:24.6022125Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-12-04T10:32:24.6031348Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T10:32:24.6040254Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-12-04T10:32:24.6051090Z Entering 'third_party/psimd' 2025-12-04T10:32:24.6060282Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-12-04T10:32:24.6069445Z Entering 'third_party/pthreadpool' 2025-12-04T10:32:24.6081753Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-12-04T10:32:24.6090847Z Entering 'third_party/pybind11' 2025-12-04T10:32:24.6100939Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-12-04T10:32:24.6117967Z Entering 'third_party/python-peachpy' 2025-12-04T10:32:24.6128384Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-12-04T10:32:24.6137041Z Entering 'third_party/sleef' 2025-12-04T10:32:24.6146601Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-12-04T10:32:24.6155022Z Entering 'third_party/tensorpipe' 2025-12-04T10:32:24.6165280Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-12-04T10:32:24.6174341Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T10:32:24.6184456Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-12-04T10:32:24.6194826Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T10:32:24.6203861Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-12-04T10:32:24.6211912Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T10:32:24.6222813Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-12-04T10:32:24.6231597Z Entering 'third_party/tensorpipe/third_party/pybind11' 
2025-12-04T10:32:24.6240781Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-12-04T10:32:24.6251646Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T10:32:24.6261022Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-12-04T10:32:24.6287170Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6303336Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6318137Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6331181Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6347782Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6363567Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6377917Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6395016Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6408200Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6421434Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6434053Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6453939Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6471153Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6484169Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6498294Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6512431Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6525553Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6539957Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6558220Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6571205Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6584175Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6596637Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6609809Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6622458Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6635884Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6649290Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6662913Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6675835Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6689096Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6701848Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6714676Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6727922Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6741049Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6755067Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config --name-only --get-regexp ^includeIf\.gitdir: 
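Each [command] line in this block inspects one submodule's private config file (git keeps them under .git/modules/<submodule path>/config) for conditional-include sections before the auth headers are rewritten. A condensed sketch of the same scan, assuming the standard .git/modules layout shown in these paths, would be:

  # walk every submodule config file and report any includeIf.gitdir sections
  find .git/modules -type f -name config | while read -r cfg; do
    git config --file "$cfg" --name-only --get-regexp '^includeIf\.gitdir:' || :
  done

The action performs the same check file by file, which is why one git config --file invocation appears per submodule here.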
2025-12-04T10:32:24.6770436Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6788264Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6801733Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6815905Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6829976Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6843137Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6856980Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6870681Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6891295Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6904399Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6917607Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6930588Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6944120Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6956967Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6970766Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.6983915Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7002898Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7016116Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7029424Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7042941Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7056415Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7070334Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7088379Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7102703Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7116515Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7131278Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7144673Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7158565Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7172479Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7187808Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7200769Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7214409Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7228237Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7241979Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7256147Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7276520Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7290360Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7303031Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7316694Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7330587Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7343870Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7358607Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7372462Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7386615Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7400970Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7414840Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7428492Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T10:32:24.7444859Z [command]/usr/bin/git config --local http.https://github.com/.extraheader 
AUTHORIZATION: basic *** 2025-12-04T10:32:24.7465419Z ##[endgroup] 2025-12-04T10:32:24.7465627Z ##[group]Fetching the repository 2025-12-04T10:32:24.7469325Z [command]/usr/bin/git -c protocol.version=2 fetch --prune --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/* 2025-12-04T10:32:26.1603220Z [command]/usr/bin/git rev-parse --verify --quiet ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32^{object} 2025-12-04T10:32:26.1779457Z ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T10:32:26.1783622Z ##[endgroup] 2025-12-04T10:32:26.1783931Z ##[group]Determining the checkout info 2025-12-04T10:32:26.1785513Z ##[endgroup] 2025-12-04T10:32:26.1790841Z [command]/usr/bin/git sparse-checkout disable 2025-12-04T10:32:26.1884496Z [command]/usr/bin/git config --local --unset-all extensions.worktreeConfig 2025-12-04T10:32:26.1906406Z ##[group]Checking out the ref 2025-12-04T10:32:26.1908242Z [command]/usr/bin/git checkout --progress --force ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T10:32:26.2212218Z HEAD is now at ffd9b0fb4355 Resolve collective autotuning test failure on arm (#168919) 2025-12-04T10:32:26.2217067Z ##[endgroup] 2025-12-04T10:32:26.2217305Z ##[group]Setting up auth for fetching submodules 2025-12-04T10:32:26.2220817Z [command]/usr/bin/git config --global http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-12-04T10:32:26.2250995Z [command]/usr/bin/git config --global --unset-all url.https://github.com/.insteadOf 2025-12-04T10:32:26.2269115Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf git@github.com: 2025-12-04T10:32:26.2285435Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf org-21003710@github.com: 2025-12-04T10:32:26.2299881Z ##[endgroup] 2025-12-04T10:32:26.2300074Z ##[group]Fetching submodules 2025-12-04T10:32:26.2301774Z [command]/usr/bin/git submodule sync --recursive 2025-12-04T10:32:26.2486686Z Synchronizing submodule url for 'android/libs/fbjni' 2025-12-04T10:32:26.2496711Z Synchronizing submodule url for 'third_party/FP16' 2025-12-04T10:32:26.2511387Z Synchronizing submodule url for 'third_party/FXdiv' 2025-12-04T10:32:26.2522763Z Synchronizing submodule url for 'third_party/NNPACK' 2025-12-04T10:32:26.2534069Z Synchronizing submodule url for 'third_party/NVTX' 2025-12-04T10:32:26.2545676Z Synchronizing submodule url for 'third_party/VulkanMemoryAllocator' 2025-12-04T10:32:26.2557524Z Synchronizing submodule url for 'third_party/XNNPACK' 2025-12-04T10:32:26.2577908Z Synchronizing submodule url for 'third_party/aiter' 2025-12-04T10:32:26.2590917Z Synchronizing submodule url for 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T10:32:26.2605313Z Synchronizing submodule url for 'third_party/benchmark' 2025-12-04T10:32:26.2615196Z Synchronizing submodule url for 'third_party/composable_kernel' 2025-12-04T10:32:26.2627862Z Synchronizing submodule url for 'third_party/cpp-httplib' 2025-12-04T10:32:26.2639405Z Synchronizing submodule url for 'third_party/cpuinfo' 2025-12-04T10:32:26.2650048Z Synchronizing submodule url for 'third_party/cudnn_frontend' 2025-12-04T10:32:26.2660695Z Synchronizing submodule url for 'third_party/cutlass' 2025-12-04T10:32:26.2681068Z Synchronizing submodule url for 'third_party/fbgemm' 2025-12-04T10:32:26.2697715Z Synchronizing submodule url for 'third_party/fbgemm/external/asmjit' 2025-12-04T10:32:26.2709001Z Synchronizing submodule url for 'third_party/fbgemm/external/composable_kernel' 2025-12-04T10:32:26.2723022Z Synchronizing 
submodule url for 'third_party/fbgemm/external/cpuinfo' 2025-12-04T10:32:26.2734587Z Synchronizing submodule url for 'third_party/fbgemm/external/cutlass' 2025-12-04T10:32:26.2748334Z Synchronizing submodule url for 'third_party/fbgemm/external/googletest' 2025-12-04T10:32:26.2757301Z Synchronizing submodule url for 'third_party/fbgemm/external/hipify_torch' 2025-12-04T10:32:26.2767566Z Synchronizing submodule url for 'third_party/fbgemm/external/json' 2025-12-04T10:32:26.2780037Z Synchronizing submodule url for 'third_party/flash-attention' 2025-12-04T10:32:26.2790567Z Synchronizing submodule url for 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T10:32:26.2809468Z Synchronizing submodule url for 'third_party/flash-attention/csrc/cutlass' 2025-12-04T10:32:26.2825657Z Synchronizing submodule url for 'third_party/flatbuffers' 2025-12-04T10:32:26.2837059Z Synchronizing submodule url for 'third_party/fmt' 2025-12-04T10:32:26.2847810Z Synchronizing submodule url for 'third_party/gemmlowp/gemmlowp' 2025-12-04T10:32:26.2858075Z Synchronizing submodule url for 'third_party/gloo' 2025-12-04T10:32:26.2869054Z Synchronizing submodule url for 'third_party/googletest' 2025-12-04T10:32:26.2880278Z Synchronizing submodule url for 'third_party/ideep' 2025-12-04T10:32:26.2890780Z Synchronizing submodule url for 'third_party/ideep/mkl-dnn' 2025-12-04T10:32:26.2908516Z Synchronizing submodule url for 'third_party/ittapi' 2025-12-04T10:32:26.2919170Z Synchronizing submodule url for 'third_party/kineto' 2025-12-04T10:32:26.2930429Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T10:32:26.2940794Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T10:32:26.2951775Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T10:32:26.2962618Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T10:32:26.2973710Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T10:32:26.2982971Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T10:32:26.3000482Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T10:32:26.3012370Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T10:32:26.3021780Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T10:32:26.3031290Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T10:32:26.3040472Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T10:32:26.3051347Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:26.3062219Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:26.3077348Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T10:32:26.3087137Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T10:32:26.3099628Z Synchronizing submodule url for 
'third_party/kleidiai' 2025-12-04T10:32:26.3110624Z Synchronizing submodule url for 'third_party/mimalloc' 2025-12-04T10:32:26.3121057Z Synchronizing submodule url for 'third_party/nlohmann' 2025-12-04T10:32:26.3131737Z Synchronizing submodule url for 'third_party/onnx' 2025-12-04T10:32:26.3148195Z Synchronizing submodule url for 'third_party/onnx/third_party/pybind11' 2025-12-04T10:32:26.3171243Z Synchronizing submodule url for 'third_party/opentelemetry-cpp' 2025-12-04T10:32:26.3193741Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T10:32:26.3209783Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T10:32:26.3227032Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T10:32:26.3241004Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T10:32:26.3250321Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T10:32:26.3266078Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T10:32:26.3278064Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T10:32:26.3290632Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:26.3300530Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:26.3313500Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T10:32:26.3338629Z Synchronizing submodule url for 'third_party/pocketfft' 2025-12-04T10:32:26.3349246Z Synchronizing submodule url for 'third_party/protobuf' 2025-12-04T10:32:26.3367201Z Synchronizing submodule url for 'third_party/protobuf/third_party/benchmark' 2025-12-04T10:32:26.3381548Z Synchronizing submodule url for 'third_party/protobuf/third_party/googletest' 2025-12-04T10:32:26.3396639Z Synchronizing submodule url for 'third_party/psimd' 2025-12-04T10:32:26.3408224Z Synchronizing submodule url for 'third_party/pthreadpool' 2025-12-04T10:32:26.3419082Z Synchronizing submodule url for 'third_party/pybind11' 2025-12-04T10:32:26.3428196Z Synchronizing submodule url for 'third_party/python-peachpy' 2025-12-04T10:32:26.3437376Z Synchronizing submodule url for 'third_party/sleef' 2025-12-04T10:32:26.3446555Z Synchronizing submodule url for 'third_party/tensorpipe' 2025-12-04T10:32:26.3456830Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/googletest' 2025-12-04T10:32:26.3466361Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/libnop' 2025-12-04T10:32:26.3477082Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/libuv' 2025-12-04T10:32:26.3487665Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T10:32:26.3498014Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T10:32:26.3530803Z [command]/usr/bin/git -c protocol.version=2 submodule update --init --force --recursive 2025-12-04T10:32:26.3755316Z Submodule path 'android/libs/fbjni': checked out '7e1e1fe3858c63c251c637ae41a20de425dde96f' 2025-12-04T10:32:26.3810489Z Submodule path 'third_party/FP16': checked out '4dfe081cf6bcd15db339cf2680b9281b8451eeb3' 2025-12-04T10:32:26.3864276Z Submodule path 'third_party/FXdiv': checked out 
'b408327ac2a15ec3e43352421954f5b1967701d1' 2025-12-04T10:32:26.3910168Z Submodule path 'third_party/NNPACK': checked out 'c07e3a0400713d546e0dea2d5466dd22ea389c73' 2025-12-04T10:32:26.3981926Z Submodule path 'third_party/NVTX': checked out '3ebbc93ded7285963bff932c678fa367eb393ba6' 2025-12-04T10:32:26.4042170Z Submodule path 'third_party/VulkanMemoryAllocator': checked out '1d8f600fd424278486eade7ed3e877c99f0846b1' 2025-12-04T10:32:26.4190301Z Submodule path 'third_party/XNNPACK': checked out '51a0103656eff6fc9bfd39a4597923c4b542c883' 2025-12-04T10:32:26.4336640Z Submodule path 'third_party/aiter': checked out '01aae101b9e5e94d6c16a9514c9fb8df99c93150' 2025-12-04T10:32:26.4513292Z Submodule path 'third_party/aiter/3rdparty/composable_kernel': checked out 'cffe8fa2a442ac8e80dd236a1a5d24fe3d7e0cbf' 2025-12-04T10:32:26.4568580Z Submodule path 'third_party/benchmark': checked out '299e5928955cc62af9968370293b916f5130916f' 2025-12-04T10:32:26.4735642Z Submodule path 'third_party/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-12-04T10:32:26.4809817Z Submodule path 'third_party/cpp-httplib': checked out '89c932f313c6437c38f2982869beacc89c2f2246' 2025-12-04T10:32:26.4877470Z Submodule path 'third_party/cpuinfo': checked out 'f858c30bcb16f8effd5ff46996f0514539e17abc' 2025-12-04T10:32:26.4944603Z Submodule path 'third_party/cudnn_frontend': checked out '0b1577c8c83401237d601d0d0db5210506705396' 2025-12-04T10:32:26.5055147Z Submodule path 'third_party/cutlass': checked out 'f88806b1e31dfa579842638740216dd41fc6c588' 2025-12-04T10:32:26.5170096Z Submodule path 'third_party/fbgemm': checked out 'c0b988d39a9e47c794d699f29930ed4d7c7e13a4' 2025-12-04T10:32:26.5220382Z Submodule path 'third_party/fbgemm/external/asmjit': checked out 'a3199e8857792cd10b7589ff5d58343d2c9008ea' 2025-12-04T10:32:26.5429939Z Submodule path 'third_party/fbgemm/external/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-12-04T10:32:26.5516666Z Submodule path 'third_party/fbgemm/external/cpuinfo': checked out '6543fec09b2f04ac4a666882998b534afc9c1349' 2025-12-04T10:32:26.5624630Z Submodule path 'third_party/fbgemm/external/cutlass': checked out '98125ce499b0fdf7ffbe0e3052f5b8709f4840f8' 2025-12-04T10:32:26.5679567Z Submodule path 'third_party/fbgemm/external/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T10:32:26.5727727Z Submodule path 'third_party/fbgemm/external/hipify_torch': checked out '63b6a7b541fa7f08f8475ca7d74054db36ff2691' 2025-12-04T10:32:26.5809888Z Submodule path 'third_party/fbgemm/external/json': checked out '9cca280a4d0ccf0c08f47a99aa71d1b0e52f8d03' 2025-12-04T10:32:26.5886160Z Submodule path 'third_party/flash-attention': checked out '979702c87a8713a8e0a5e9fee122b90d2ef13be5' 2025-12-04T10:32:26.6042366Z Submodule path 'third_party/flash-attention/csrc/composable_kernel': checked out '888317e698e9803c62bd38568abc9e05d7709f33' 2025-12-04T10:32:26.6144740Z Submodule path 'third_party/flash-attention/csrc/cutlass': checked out 'c506e16788cb08416a4a57e11a9067beeee29420' 2025-12-04T10:32:26.6229548Z Submodule path 'third_party/flatbuffers': checked out 'a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757' 2025-12-04T10:32:26.6283820Z Submodule path 'third_party/fmt': checked out '407c905e45ad75fc29bf0f9bb7c5c2fd3475976f' 2025-12-04T10:32:26.6333869Z Submodule path 'third_party/gemmlowp/gemmlowp': checked out '3fb5c176c17c765a3492cd2f0321b0dab712f350' 2025-12-04T10:32:26.6400944Z Submodule path 'third_party/gloo': checked out 
'54cbae0d3a67fa890b4c3d9ee162b7860315e341' 2025-12-04T10:32:26.6453641Z Submodule path 'third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T10:32:26.6504976Z Submodule path 'third_party/ideep': checked out '719d8e6cd7f7a0e01b155657526d693acf97c2b3' 2025-12-04T10:32:26.6670669Z Submodule path 'third_party/ideep/mkl-dnn': checked out '8d263e693366ef8db40acc569cc7d8edf644556d' 2025-12-04T10:32:26.6726817Z Submodule path 'third_party/ittapi': checked out 'dec1d23ca65ab069d225dfe40dea14f455170959' 2025-12-04T10:32:26.6810458Z Submodule path 'third_party/kineto': checked out '31f85df8fbd89c188f14ef10f1ec65379786b943' 2025-12-04T10:32:26.6889238Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog': checked out 'd2ffe0a4e3acace628db49974246b66fc3e85fb1' 2025-12-04T10:32:26.6956352Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM': checked out 'ffde4e54bc7249a6039a5e6b45b395141e1217f9' 2025-12-04T10:32:26.7021250Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr': checked out '871ed52d350214a034f6ef8a3b8f51c5ce1bd400' 2025-12-04T10:32:26.7081158Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt': checked out 'cd4af11efc9c622896a3e4cb599fa28668ca3d05' 2025-12-04T10:32:26.7133946Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags': checked out 'e171aa2d15ed9eb17054558e0b3a6a413bb01067' 2025-12-04T10:32:26.7181653Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc': checked out '8411df715cf522606e3b1aca386ddfc0b63d34b4' 2025-12-04T10:32:26.7247228Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog': checked out 'b33e3bad4c46c8a6345525fd822af355e5ef9446' 2025-12-04T10:32:26.7314360Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T10:32:26.7397121Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json': checked out '4f8fba14066156b73f1189a2b8bd568bde5284c5' 2025-12-04T10:32:26.7447622Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs': checked out 'f68a2fa8ea36c783bdd760371411fcb495aa3150' 2025-12-04T10:32:26.7542382Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp': checked out 'b1234816facfdda29845c46696a02998a4af115a' 2025-12-04T10:32:26.7637139Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'd7ba35bbb649209c66e582d5a0244ba988a15159' 2025-12-04T10:32:26.7710644Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-12-04T10:32:26.7773637Z Submodule path 'third_party/kineto/libkineto/third_party/fmt': checked out '40626af88bd7df9a5fb80be7b25ac85b122d6c21' 2025-12-04T10:32:26.7827087Z Submodule path 'third_party/kineto/libkineto/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T10:32:26.7909364Z Submodule path 'third_party/kleidiai': checked out 'd7770c89632329a9914ef1a90289917597639cbe' 2025-12-04T10:32:26.7982580Z Submodule path 'third_party/mimalloc': checked out 'fbd8b99c2b828428947d70fdc046bb55609be93e' 2025-12-04T10:32:26.8069036Z Submodule path 'third_party/nlohmann': checked out 
'55f93686c01528224f448c19128836e7df245f72' 2025-12-04T10:32:26.8216737Z Submodule path 'third_party/onnx': checked out 'e709452ef2bbc1d113faf678c24e6d3467696e83' 2025-12-04T10:32:26.8291946Z Submodule path 'third_party/onnx/third_party/pybind11': checked out 'a2e59f0e7065404b44dfe92a28aca47ba1378dc4' 2025-12-04T10:32:26.8390084Z Submodule path 'third_party/opentelemetry-cpp': checked out 'a799f4aed9c94b765dcdaabaeab7d5e7e2310878' 2025-12-04T10:32:26.8453183Z Submodule path 'third_party/opentelemetry-cpp/third_party/benchmark': checked out 'd572f4777349d43653b21d6c2fc63020ab326db2' 2025-12-04T10:32:26.8511951Z Submodule path 'third_party/opentelemetry-cpp/third_party/googletest': checked out 'b796f7d44681514f58a683a3a71ff17c94edb0c1' 2025-12-04T10:32:26.8572420Z Submodule path 'third_party/opentelemetry-cpp/third_party/ms-gsl': checked out '6f4529395c5b7c2d661812257cd6780c67e54afa' 2025-12-04T10:32:26.8661695Z Submodule path 'third_party/opentelemetry-cpp/third_party/nlohmann-json': checked out 'bc889afb4c5bf1c0d8ee29ef35eaaf4c8bef8a5d' 2025-12-04T10:32:26.8714063Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto': checked out '4ca4f0335c63cda7ab31ea7ed70d6553aee14dce' 2025-12-04T10:32:26.8766239Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp': checked out '06b57f48ded1fa3bdd3d4346f6ef29e40e08eaf5' 2025-12-04T10:32:26.8830732Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp': checked out 'c9ffcdda9086ffd9e1283ea7a0276d831f3c8a8d' 2025-12-04T10:32:26.8899620Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'eefb26f82b233268fc98577d265352720d477ba4' 2025-12-04T10:32:26.8971682Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-12-04T10:32:26.9127935Z Submodule path 'third_party/opentelemetry-cpp/tools/vcpkg': checked out '8eb57355a4ffb410a2e94c07b4dca2dffbee8e50' 2025-12-04T10:32:26.9189418Z Submodule path 'third_party/pocketfft': checked out '0fa0ef591e38c2758e3184c6c23e497b9f732ffa' 2025-12-04T10:32:26.9348013Z Submodule path 'third_party/protobuf': checked out 'd1eca4e4b421cd2997495c4b4e65cea6be4e9b8a' 2025-12-04T10:32:26.9401380Z Submodule path 'third_party/protobuf/third_party/benchmark': checked out '5b7683f49e1e9223cf9927b24f6fd3d6bd82e3f8' 2025-12-04T10:32:26.9465905Z Submodule path 'third_party/protobuf/third_party/googletest': checked out '5ec7f0c4a113e2f18ac2c6cc7df51ad6afc24081' 2025-12-04T10:32:26.9518223Z Submodule path 'third_party/psimd': checked out '072586a71b55b7f8c584153d223e95687148a900' 2025-12-04T10:32:26.9569017Z Submodule path 'third_party/pthreadpool': checked out '4fe0e1e183925bf8cfa6aae24237e724a96479b8' 2025-12-04T10:32:26.9629932Z Submodule path 'third_party/pybind11': checked out 'f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8' 2025-12-04T10:32:26.9677998Z Submodule path 'third_party/python-peachpy': checked out 'f45429b087dd7d5bc78bb40dc7cf06425c252d67' 2025-12-04T10:32:26.9732443Z Submodule path 'third_party/sleef': checked out '5a1d179df9cf652951b59010a2d2075372d67f68' 2025-12-04T10:32:26.9792691Z Submodule path 'third_party/tensorpipe': checked out '2b4cd91092d335a697416b2a3cb398283246849d' 2025-12-04T10:32:26.9846967Z Submodule path 'third_party/tensorpipe/third_party/googletest': checked out 'aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e' 2025-12-04T10:32:26.9891855Z Submodule path 'third_party/tensorpipe/third_party/libnop': 
checked out '910b55815be16109f04f4180e9adee14fb4ce281' 2025-12-04T10:32:27.0021355Z Submodule path 'third_party/tensorpipe/third_party/libuv': checked out '5152db2cbfeb5582e9c27c5ea1dba2cd9e10759b' 2025-12-04T10:32:27.0084390Z Submodule path 'third_party/tensorpipe/third_party/pybind11': checked out 'a23996fce38ff6ccfbcdc09f1e63f2c4be5ea2ef' 2025-12-04T10:32:27.0131164Z Submodule path 'third_party/tensorpipe/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2025-12-04T10:32:27.0154137Z [command]/usr/bin/git submodule foreach --recursive git config --local gc.auto 0 2025-12-04T10:32:27.0362198Z Entering 'android/libs/fbjni' 2025-12-04T10:32:27.0384516Z Entering 'third_party/FP16' 2025-12-04T10:32:27.0410159Z Entering 'third_party/FXdiv' 2025-12-04T10:32:27.0431784Z Entering 'third_party/NNPACK' 2025-12-04T10:32:27.0450677Z Entering 'third_party/NVTX' 2025-12-04T10:32:27.0470299Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T10:32:27.0491033Z Entering 'third_party/XNNPACK' 2025-12-04T10:32:27.0523585Z Entering 'third_party/aiter' 2025-12-04T10:32:27.0544176Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T10:32:27.0566373Z Entering 'third_party/benchmark' 2025-12-04T10:32:27.0586040Z Entering 'third_party/composable_kernel' 2025-12-04T10:32:27.0606952Z Entering 'third_party/cpp-httplib' 2025-12-04T10:32:27.0628756Z Entering 'third_party/cpuinfo' 2025-12-04T10:32:27.0647258Z Entering 'third_party/cudnn_frontend' 2025-12-04T10:32:27.0667145Z Entering 'third_party/cutlass' 2025-12-04T10:32:27.0691905Z Entering 'third_party/fbgemm' 2025-12-04T10:32:27.0713249Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T10:32:27.0733768Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T10:32:27.0758425Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T10:32:27.0777461Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T10:32:27.0799973Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T10:32:27.0818731Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T10:32:27.0837396Z Entering 'third_party/fbgemm/external/json' 2025-12-04T10:32:27.0858458Z Entering 'third_party/flash-attention' 2025-12-04T10:32:27.0885501Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T10:32:27.0908669Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T10:32:27.0938889Z Entering 'third_party/flatbuffers' 2025-12-04T10:32:27.0958772Z Entering 'third_party/fmt' 2025-12-04T10:32:27.0977948Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T10:32:27.1004266Z Entering 'third_party/gloo' 2025-12-04T10:32:27.1024004Z Entering 'third_party/googletest' 2025-12-04T10:32:27.1048412Z Entering 'third_party/ideep' 2025-12-04T10:32:27.1067641Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T10:32:27.1091365Z Entering 'third_party/ittapi' 2025-12-04T10:32:27.1110729Z Entering 'third_party/kineto' 2025-12-04T10:32:27.1130229Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T10:32:27.1148003Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T10:32:27.1178812Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T10:32:27.1201673Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T10:32:27.1227713Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T10:32:27.1248328Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T10:32:27.1269725Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T10:32:27.1288489Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T10:32:27.1307534Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T10:32:27.1327438Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T10:32:27.1345338Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T10:32:27.1362692Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:27.1381965Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:27.1405070Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T10:32:27.1428818Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T10:32:27.1449171Z Entering 'third_party/kleidiai' 2025-12-04T10:32:27.1467897Z Entering 'third_party/mimalloc' 2025-12-04T10:32:27.1496132Z Entering 'third_party/nlohmann' 2025-12-04T10:32:27.1515405Z Entering 'third_party/onnx' 2025-12-04T10:32:27.1552313Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T10:32:27.1574975Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T10:32:27.1595725Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T10:32:27.1615794Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T10:32:27.1643435Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T10:32:27.1663984Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T10:32:27.1683047Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T10:32:27.1701023Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T10:32:27.1720535Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T10:32:27.1745580Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:27.1765209Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:27.1784824Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T10:32:27.1811499Z Entering 'third_party/pocketfft' 2025-12-04T10:32:27.1835427Z Entering 'third_party/protobuf' 2025-12-04T10:32:27.1856602Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T10:32:27.1874806Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T10:32:27.1899825Z Entering 'third_party/psimd' 2025-12-04T10:32:27.1925956Z Entering 'third_party/pthreadpool' 2025-12-04T10:32:27.1945588Z Entering 'third_party/pybind11' 2025-12-04T10:32:27.1970166Z Entering 'third_party/python-peachpy' 2025-12-04T10:32:27.1990610Z Entering 'third_party/sleef' 2025-12-04T10:32:27.2026091Z Entering 'third_party/tensorpipe' 2025-12-04T10:32:27.2060621Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T10:32:27.2093875Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T10:32:27.2121323Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T10:32:27.2152404Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T10:32:27.2176628Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T10:32:27.2215818Z 
##[endgroup] 2025-12-04T10:32:27.2216125Z ##[group]Persisting credentials for submodules 2025-12-04T10:32:27.2224450Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'url\.https\:\/\/github\.com\/\.insteadOf' && git config --local --unset-all 'url.https://github.com/.insteadOf' || :" 2025-12-04T10:32:27.2426847Z Entering 'android/libs/fbjni' 2025-12-04T10:32:27.2442517Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2443023Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2459878Z Entering 'third_party/FP16' 2025-12-04T10:32:27.2475276Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2475562Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2499631Z Entering 'third_party/FXdiv' 2025-12-04T10:32:27.2517741Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2517923Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2536349Z Entering 'third_party/NNPACK' 2025-12-04T10:32:27.2553822Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2553980Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2571669Z Entering 'third_party/NVTX' 2025-12-04T10:32:27.2584778Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2584952Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2601846Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T10:32:27.2616395Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2616695Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2637338Z Entering 'third_party/XNNPACK' 2025-12-04T10:32:27.2651710Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2651915Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2680469Z Entering 'third_party/aiter' 2025-12-04T10:32:27.2694447Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2694632Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2715022Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T10:32:27.2731521Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2731694Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2754611Z Entering 'third_party/benchmark' 2025-12-04T10:32:27.2771728Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2771885Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2794082Z Entering 'third_party/composable_kernel' 2025-12-04T10:32:27.2808459Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2808607Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2830835Z Entering 'third_party/cpp-httplib' 2025-12-04T10:32:27.2844619Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2844750Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2861965Z Entering 'third_party/cpuinfo' 2025-12-04T10:32:27.2875147Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2875507Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2895855Z Entering 'third_party/cudnn_frontend' 2025-12-04T10:32:27.2911896Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2912033Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2930270Z Entering 'third_party/cutlass' 2025-12-04T10:32:27.2942441Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2942721Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2971574Z Entering 'third_party/fbgemm' 2025-12-04T10:32:27.2986732Z url.https://github.com/.insteadof 2025-12-04T10:32:27.2986859Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3012947Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T10:32:27.3029695Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3029881Z 
url.https://github.com/.insteadof 2025-12-04T10:32:27.3057860Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T10:32:27.3070673Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3070813Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3094932Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T10:32:27.3108598Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3108722Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3127931Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T10:32:27.3141318Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3141568Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3160726Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T10:32:27.3172726Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3172903Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3194666Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T10:32:27.3210498Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3210679Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3228221Z Entering 'third_party/fbgemm/external/json' 2025-12-04T10:32:27.3241313Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3241498Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3259905Z Entering 'third_party/flash-attention' 2025-12-04T10:32:27.3278245Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3278413Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3296074Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T10:32:27.3310674Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3311161Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3326990Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T10:32:27.3338237Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3338508Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3359887Z Entering 'third_party/flatbuffers' 2025-12-04T10:32:27.3372570Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3372792Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3391689Z Entering 'third_party/fmt' 2025-12-04T10:32:27.3404554Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3404762Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3427179Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T10:32:27.3440394Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3440588Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3456657Z Entering 'third_party/gloo' 2025-12-04T10:32:27.3470519Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3470706Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3494715Z Entering 'third_party/googletest' 2025-12-04T10:32:27.3507729Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3507916Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3527267Z Entering 'third_party/ideep' 2025-12-04T10:32:27.3540356Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3540526Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3558335Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T10:32:27.3573507Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3573675Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3595667Z Entering 'third_party/ittapi' 2025-12-04T10:32:27.3609296Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3609559Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3625629Z Entering 'third_party/kineto' 2025-12-04T10:32:27.3639048Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3639247Z 
url.https://github.com/.insteadof 2025-12-04T10:32:27.3655907Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T10:32:27.3667858Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3668157Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3685259Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T10:32:27.3697561Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3697729Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3715094Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T10:32:27.3731882Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3732206Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3749048Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T10:32:27.3762934Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3763230Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3780225Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T10:32:27.3791936Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3792110Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3807869Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T10:32:27.3820185Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3820346Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3836745Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T10:32:27.3848619Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3848771Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3865230Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T10:32:27.3878593Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3878746Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3903751Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T10:32:27.3915247Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3915402Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3932180Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T10:32:27.3944535Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3944688Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3965978Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T10:32:27.3979931Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3980077Z url.https://github.com/.insteadof 2025-12-04T10:32:27.3998395Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:27.4013908Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4014247Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4033468Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:27.4052991Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4053120Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4074385Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T10:32:27.4086183Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4086325Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4104562Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T10:32:27.4117193Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4117322Z url.https://github.com/.insteadof 
2025-12-04T10:32:27.4136225Z Entering 'third_party/kleidiai' 2025-12-04T10:32:27.4148439Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4148569Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4164904Z Entering 'third_party/mimalloc' 2025-12-04T10:32:27.4177602Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4177885Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4195450Z Entering 'third_party/nlohmann' 2025-12-04T10:32:27.4206987Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4207201Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4237739Z Entering 'third_party/onnx' 2025-12-04T10:32:27.4239555Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4263554Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4263849Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T10:32:27.4277903Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4278142Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4301301Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T10:32:27.4315892Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4316041Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4336510Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T10:32:27.4348508Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4348665Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4364644Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T10:32:27.4378419Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4378566Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4394235Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T10:32:27.4407014Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4407167Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4423228Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T10:32:27.4436916Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4437061Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4454636Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T10:32:27.4467764Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4467914Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4482134Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T10:32:27.4492540Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4492669Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4511290Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T10:32:27.4523763Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4523890Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4538684Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:27.4551130Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4551253Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4571959Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:27.4584491Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4584618Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4604530Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T10:32:27.4616725Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4617042Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4643085Z Entering 'third_party/pocketfft' 2025-12-04T10:32:27.4656726Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4656848Z 
url.https://github.com/.insteadof 2025-12-04T10:32:27.4676966Z Entering 'third_party/protobuf' 2025-12-04T10:32:27.4689148Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4689269Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4707318Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T10:32:27.4722850Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4722974Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4737169Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T10:32:27.4749904Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4750028Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4769308Z Entering 'third_party/psimd' 2025-12-04T10:32:27.4781794Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4781908Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4798824Z Entering 'third_party/pthreadpool' 2025-12-04T10:32:27.4810071Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4810189Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4825901Z Entering 'third_party/pybind11' 2025-12-04T10:32:27.4837742Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4837863Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4853695Z Entering 'third_party/python-peachpy' 2025-12-04T10:32:27.4866620Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4866745Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4883029Z Entering 'third_party/sleef' 2025-12-04T10:32:27.4895720Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4895846Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4915374Z Entering 'third_party/tensorpipe' 2025-12-04T10:32:27.4931972Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4932097Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4948606Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T10:32:27.4961231Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4961359Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4980687Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T10:32:27.4992814Z url.https://github.com/.insteadof 2025-12-04T10:32:27.4992940Z url.https://github.com/.insteadof 2025-12-04T10:32:27.5014275Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T10:32:27.5030540Z url.https://github.com/.insteadof 2025-12-04T10:32:27.5030672Z url.https://github.com/.insteadof 2025-12-04T10:32:27.5046551Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T10:32:27.5059357Z url.https://github.com/.insteadof 2025-12-04T10:32:27.5059483Z url.https://github.com/.insteadof 2025-12-04T10:32:27.5074575Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T10:32:27.5089116Z url.https://github.com/.insteadof 2025-12-04T10:32:27.5089239Z url.https://github.com/.insteadof 2025-12-04T10:32:27.5127674Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local 'http.https://github.com/.extraheader' 'AUTHORIZATION: basic ***' && git config --local --show-origin --name-only --get-regexp remote.origin.url" 2025-12-04T10:32:27.5294564Z Entering 'android/libs/fbjni' 2025-12-04T10:32:27.5313948Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T10:32:27.5323948Z Entering 'third_party/FP16' 2025-12-04T10:32:27.5348101Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T10:32:27.5358974Z Entering 'third_party/FXdiv' 2025-12-04T10:32:27.5379110Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T10:32:27.5388864Z Entering 'third_party/NNPACK' 2025-12-04T10:32:27.5408228Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T10:32:27.5418212Z Entering 'third_party/NVTX' 2025-12-04T10:32:27.5436493Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T10:32:27.5450990Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T10:32:27.5470785Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T10:32:27.5480669Z Entering 'third_party/XNNPACK' 2025-12-04T10:32:27.5499371Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T10:32:27.5516179Z Entering 'third_party/aiter' 2025-12-04T10:32:27.5534349Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T10:32:27.5545823Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T10:32:27.5564962Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T10:32:27.5579320Z Entering 'third_party/benchmark' 2025-12-04T10:32:27.5601032Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T10:32:27.5610665Z Entering 'third_party/composable_kernel' 2025-12-04T10:32:27.5631430Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T10:32:27.5644085Z Entering 'third_party/cpp-httplib' 2025-12-04T10:32:27.5664160Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T10:32:27.5676799Z Entering 'third_party/cpuinfo' 2025-12-04T10:32:27.5701611Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T10:32:27.5713465Z Entering 'third_party/cudnn_frontend' 2025-12-04T10:32:27.5733222Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-12-04T10:32:27.5743201Z Entering 'third_party/cutlass' 2025-12-04T10:32:27.5764261Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-12-04T10:32:27.5777333Z Entering 'third_party/fbgemm' 2025-12-04T10:32:27.5796563Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T10:32:27.5807127Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T10:32:27.5843645Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T10:32:27.5855898Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T10:32:27.5877556Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T10:32:27.5890184Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T10:32:27.5909907Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-12-04T10:32:27.5919838Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T10:32:27.5939725Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T10:32:27.5951608Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T10:32:27.5973035Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T10:32:27.5982090Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T10:32:27.6001486Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T10:32:27.6014832Z Entering 'third_party/fbgemm/external/json' 2025-12-04T10:32:27.6037873Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-12-04T10:32:27.6049922Z Entering 'third_party/flash-attention' 2025-12-04T10:32:27.6069665Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T10:32:27.6079644Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T10:32:27.6098533Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-12-04T10:32:27.6109972Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T10:32:27.6129334Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-12-04T10:32:27.6144045Z Entering 'third_party/flatbuffers' 2025-12-04T10:32:27.6161768Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T10:32:27.6173351Z Entering 'third_party/fmt' 2025-12-04T10:32:27.6195369Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-12-04T10:32:27.6205216Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T10:32:27.6224344Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T10:32:27.6234530Z Entering 'third_party/gloo' 2025-12-04T10:32:27.6254240Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-12-04T10:32:27.6269987Z Entering 'third_party/googletest' 2025-12-04T10:32:27.6299504Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-12-04T10:32:27.6309923Z Entering 'third_party/ideep' 2025-12-04T10:32:27.6328675Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T10:32:27.6338044Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T10:32:27.6378839Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T10:32:27.6392390Z Entering 'third_party/ittapi' 2025-12-04T10:32:27.6416858Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-12-04T10:32:27.6426815Z Entering 'third_party/kineto' 2025-12-04T10:32:27.6443486Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T10:32:27.6454382Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T10:32:27.6476291Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T10:32:27.6486653Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T10:32:27.6511114Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T10:32:27.6521698Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T10:32:27.6539763Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T10:32:27.6548941Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T10:32:27.6568424Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T10:32:27.6577576Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T10:32:27.6602196Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T10:32:27.6615927Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T10:32:27.6641167Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T10:32:27.6653078Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T10:32:27.6675442Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-12-04T10:32:27.6684665Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T10:32:27.6702887Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T10:32:27.6714979Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T10:32:27.6736951Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T10:32:27.6746937Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T10:32:27.6765800Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T10:32:27.6775798Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T10:32:27.6796765Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T10:32:27.6806173Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:27.6830134Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T10:32:27.6844785Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:27.6863196Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T10:32:27.6878804Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T10:32:27.6898200Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-12-04T10:32:27.6907572Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T10:32:27.6926616Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-12-04T10:32:27.6938081Z Entering 'third_party/kleidiai' 2025-12-04T10:32:27.6976592Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-12-04T10:32:27.6988829Z Entering 'third_party/mimalloc' 2025-12-04T10:32:27.7008000Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-12-04T10:32:27.7017966Z Entering 'third_party/nlohmann' 2025-12-04T10:32:27.7040173Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-12-04T10:32:27.7048101Z Entering 'third_party/onnx' 2025-12-04T10:32:27.7067045Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-12-04T10:32:27.7082771Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T10:32:27.7104452Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-12-04T10:32:27.7116961Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T10:32:27.7137108Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-12-04T10:32:27.7146654Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T10:32:27.7167139Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-12-04T10:32:27.7175298Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T10:32:27.7194552Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-12-04T10:32:27.7203491Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T10:32:27.7226576Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-12-04T10:32:27.7239683Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T10:32:27.7281265Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-12-04T10:32:27.7290934Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T10:32:27.7310637Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-12-04T10:32:27.7319412Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T10:32:27.7339561Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-12-04T10:32:27.7349232Z Entering 
'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T10:32:27.7367593Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T10:32:27.7374840Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:27.7392313Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T10:32:27.7402095Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:27.7427302Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T10:32:27.7438524Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T10:32:27.7459140Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-12-04T10:32:27.7475935Z Entering 'third_party/pocketfft' 2025-12-04T10:32:27.7498818Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-12-04T10:32:27.7514094Z Entering 'third_party/protobuf' 2025-12-04T10:32:27.7533880Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-12-04T10:32:27.7543700Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T10:32:27.7562442Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-12-04T10:32:27.7571925Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T10:32:27.7589191Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-12-04T10:32:27.7607128Z Entering 'third_party/psimd' 2025-12-04T10:32:27.7627081Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-12-04T10:32:27.7637220Z Entering 'third_party/pthreadpool' 2025-12-04T10:32:27.7662019Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-12-04T10:32:27.7672236Z Entering 'third_party/pybind11' 2025-12-04T10:32:27.7693102Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-12-04T10:32:27.7703822Z Entering 'third_party/python-peachpy' 2025-12-04T10:32:27.7726384Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-12-04T10:32:27.7736353Z Entering 'third_party/sleef' 2025-12-04T10:32:27.7757135Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-12-04T10:32:27.7767242Z Entering 'third_party/tensorpipe' 2025-12-04T10:32:27.7785322Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-12-04T10:32:27.7795817Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T10:32:27.7814763Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-12-04T10:32:27.7823790Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T10:32:27.7843087Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-12-04T10:32:27.7852974Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T10:32:27.7873041Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-12-04T10:32:27.7882958Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T10:32:27.7912127Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-12-04T10:32:27.7921857Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T10:32:27.7943002Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-12-04T10:32:27.8165297Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:' 2025-12-04T10:32:27.8325824Z Entering 'android/libs/fbjni' 2025-12-04T10:32:27.8349715Z Entering 'third_party/FP16' 2025-12-04T10:32:27.8369331Z Entering 'third_party/FXdiv' 2025-12-04T10:32:27.8390307Z Entering 'third_party/NNPACK' 2025-12-04T10:32:27.8409134Z Entering 'third_party/NVTX' 2025-12-04T10:32:27.8428188Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T10:32:27.8446222Z Entering 'third_party/XNNPACK' 2025-12-04T10:32:27.8471669Z Entering 'third_party/aiter' 2025-12-04T10:32:27.8492687Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T10:32:27.8534056Z Entering 'third_party/benchmark' 2025-12-04T10:32:27.8559454Z Entering 'third_party/composable_kernel' 2025-12-04T10:32:27.8585083Z Entering 'third_party/cpp-httplib' 2025-12-04T10:32:27.8605286Z Entering 'third_party/cpuinfo' 2025-12-04T10:32:27.8623720Z Entering 'third_party/cudnn_frontend' 2025-12-04T10:32:27.8641958Z Entering 'third_party/cutlass' 2025-12-04T10:32:27.8663955Z Entering 'third_party/fbgemm' 2025-12-04T10:32:27.8684257Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T10:32:27.8703357Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T10:32:27.8726030Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T10:32:27.8745849Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T10:32:27.8767879Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T10:32:27.8786294Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T10:32:27.8805426Z Entering 'third_party/fbgemm/external/json' 2025-12-04T10:32:27.8825224Z Entering 'third_party/flash-attention' 2025-12-04T10:32:27.8847893Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T10:32:27.8873890Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T10:32:27.8896891Z Entering 'third_party/flatbuffers' 2025-12-04T10:32:27.8917454Z Entering 'third_party/fmt' 2025-12-04T10:32:27.8938314Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T10:32:27.8962631Z Entering 'third_party/gloo' 2025-12-04T10:32:27.8981945Z Entering 'third_party/googletest' 2025-12-04T10:32:27.9001652Z Entering 'third_party/ideep' 2025-12-04T10:32:27.9021302Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T10:32:27.9045225Z Entering 'third_party/ittapi' 2025-12-04T10:32:27.9066326Z Entering 'third_party/kineto' 2025-12-04T10:32:27.9086136Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T10:32:27.9105069Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T10:32:27.9123482Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T10:32:27.9140547Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T10:32:27.9158535Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T10:32:27.9181532Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T10:32:27.9220315Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T10:32:27.9244083Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T10:32:27.9264503Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T10:32:27.9285399Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T10:32:27.9303303Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T10:32:27.9327053Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:27.9347236Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:27.9373025Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T10:32:27.9391676Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T10:32:27.9416391Z Entering 'third_party/kleidiai' 2025-12-04T10:32:27.9437851Z Entering 'third_party/mimalloc' 2025-12-04T10:32:27.9457454Z Entering 'third_party/nlohmann' 2025-12-04T10:32:27.9476660Z Entering 'third_party/onnx' 2025-12-04T10:32:27.9506924Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T10:32:27.9529893Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T10:32:27.9549373Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T10:32:27.9567017Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T10:32:27.9588118Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T10:32:27.9606223Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T10:32:27.9624561Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T10:32:27.9642988Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T10:32:27.9661893Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T10:32:27.9681097Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:27.9700857Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:27.9723230Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T10:32:27.9753931Z Entering 'third_party/pocketfft' 2025-12-04T10:32:27.9773505Z Entering 'third_party/protobuf' 2025-12-04T10:32:27.9793830Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T10:32:27.9818755Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T10:32:27.9843558Z Entering 'third_party/psimd' 2025-12-04T10:32:27.9863195Z Entering 'third_party/pthreadpool' 2025-12-04T10:32:27.9891114Z Entering 'third_party/pybind11' 2025-12-04T10:32:27.9915683Z Entering 'third_party/python-peachpy' 2025-12-04T10:32:27.9935777Z Entering 'third_party/sleef' 2025-12-04T10:32:27.9955575Z Entering 'third_party/tensorpipe' 2025-12-04T10:32:27.9984002Z 
Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T10:32:28.0001567Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T10:32:28.0020125Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T10:32:28.0040766Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T10:32:28.0059385Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T10:32:28.0108837Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'org-21003710@github.com:' 2025-12-04T10:32:28.0276201Z Entering 'android/libs/fbjni' 2025-12-04T10:32:28.0299768Z Entering 'third_party/FP16' 2025-12-04T10:32:28.0326598Z Entering 'third_party/FXdiv' 2025-12-04T10:32:28.0347982Z Entering 'third_party/NNPACK' 2025-12-04T10:32:28.0367884Z Entering 'third_party/NVTX' 2025-12-04T10:32:28.0388038Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T10:32:28.0411228Z Entering 'third_party/XNNPACK' 2025-12-04T10:32:28.0443836Z Entering 'third_party/aiter' 2025-12-04T10:32:28.0465251Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T10:32:28.0491419Z Entering 'third_party/benchmark' 2025-12-04T10:32:28.0511062Z Entering 'third_party/composable_kernel' 2025-12-04T10:32:28.0533356Z Entering 'third_party/cpp-httplib' 2025-12-04T10:32:28.0552768Z Entering 'third_party/cpuinfo' 2025-12-04T10:32:28.0573527Z Entering 'third_party/cudnn_frontend' 2025-12-04T10:32:28.0592991Z Entering 'third_party/cutlass' 2025-12-04T10:32:28.0618102Z Entering 'third_party/fbgemm' 2025-12-04T10:32:28.0640199Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T10:32:28.0659043Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T10:32:28.0681776Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T10:32:28.0700104Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T10:32:28.0722671Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T10:32:28.0741675Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T10:32:28.0759773Z Entering 'third_party/fbgemm/external/json' 2025-12-04T10:32:28.0779901Z Entering 'third_party/flash-attention' 2025-12-04T10:32:28.0799802Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T10:32:28.0819943Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T10:32:28.0842997Z Entering 'third_party/flatbuffers' 2025-12-04T10:32:28.0864196Z Entering 'third_party/fmt' 2025-12-04T10:32:28.0886615Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T10:32:28.0910198Z Entering 'third_party/gloo' 2025-12-04T10:32:28.0933649Z Entering 'third_party/googletest' 2025-12-04T10:32:28.0951662Z Entering 'third_party/ideep' 2025-12-04T10:32:28.0978311Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T10:32:28.1001706Z Entering 'third_party/ittapi' 2025-12-04T10:32:28.1020691Z Entering 'third_party/kineto' 2025-12-04T10:32:28.1043427Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T10:32:28.1062760Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T10:32:28.1081796Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T10:32:28.1099669Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T10:32:28.1120096Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T10:32:28.1138866Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 
2025-12-04T10:32:28.1160488Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T10:32:28.1178836Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T10:32:28.1196600Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T10:32:28.1215082Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T10:32:28.1232923Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T10:32:28.1251137Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:28.1272312Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:28.1298123Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T10:32:28.1322356Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T10:32:28.1346486Z Entering 'third_party/kleidiai' 2025-12-04T10:32:28.1370028Z Entering 'third_party/mimalloc' 2025-12-04T10:32:28.1389859Z Entering 'third_party/nlohmann' 2025-12-04T10:32:28.1410866Z Entering 'third_party/onnx' 2025-12-04T10:32:28.1439765Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T10:32:28.1466831Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T10:32:28.1487543Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T10:32:28.1505497Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T10:32:28.1522532Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T10:32:28.1539761Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T10:32:28.1558244Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T10:32:28.1575575Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T10:32:28.1594178Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T10:32:28.1613010Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T10:32:28.1648771Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T10:32:28.1672311Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T10:32:28.1698864Z Entering 'third_party/pocketfft' 2025-12-04T10:32:28.1717394Z Entering 'third_party/protobuf' 2025-12-04T10:32:28.1743676Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T10:32:28.1763683Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T10:32:28.1784767Z Entering 'third_party/psimd' 2025-12-04T10:32:28.1803707Z Entering 'third_party/pthreadpool' 2025-12-04T10:32:28.1822974Z Entering 'third_party/pybind11' 2025-12-04T10:32:28.1843055Z Entering 'third_party/python-peachpy' 2025-12-04T10:32:28.1864215Z Entering 'third_party/sleef' 2025-12-04T10:32:28.1884915Z Entering 'third_party/tensorpipe' 2025-12-04T10:32:28.1903899Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T10:32:28.1924832Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T10:32:28.1950492Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T10:32:28.1970924Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T10:32:28.1988469Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T10:32:28.2020076Z ##[endgroup] 2025-12-04T10:32:28.2450816Z [command]/usr/bin/git log -1 
--format=%H 2025-12-04T10:32:28.2936957Z ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T10:32:28.3070838Z Prepare all required actions 2025-12-04T10:32:28.3071168Z Getting action download info 2025-12-04T10:32:28.5460759Z Download action repository 'aws-actions/amazon-ecr-login@062b18b96a7aff071d4dc91bc00c4c1a7945b076' (SHA:062b18b96a7aff071d4dc91bc00c4c1a7945b076) 2025-12-04T10:32:29.4359857Z ##[group]Run ./.github/actions/setup-rocm 2025-12-04T10:32:29.4359993Z env: 2025-12-04T10:32:29.4360077Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:32:29.4360188Z ##[endgroup] 2025-12-04T10:32:29.4371296Z ##[group]Run dpkg -l | grep -E " rocm" 2025-12-04T10:32:29.4371430Z dpkg -l | grep -E " rocm" 2025-12-04T10:32:29.4374660Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T10:32:29.4374798Z env: 2025-12-04T10:32:29.4374881Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:32:29.4374982Z ##[endgroup] 2025-12-04T10:32:29.4435158Z ii rocm-cmake 0.14.0.60401-83~22.04 amd64 rocm-cmake built using CMake 2025-12-04T10:32:29.4435626Z ii rocm-core 6.4.1.60401-83~22.04 amd64 ROCm Runtime software stack 2025-12-04T10:32:29.4436008Z ii rocm-dbgapi 0.77.2.60401-83~22.04 amd64 Library to provide AMD GPU debugger API 2025-12-04T10:32:29.4436428Z ii rocm-debug-agent 2.0.4.60401-83~22.04 amd64 Radeon Open Compute Debug Agent (ROCdebug-agent) 2025-12-04T10:32:29.4437212Z ii rocm-dev 6.4.1.60401-83~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-12-04T10:32:29.4437653Z ii rocm-device-libs 1.0.0.60401-83~22.04 amd64 Radeon Open Compute - device libraries 2025-12-04T10:32:29.4438013Z ii rocm-gdb 15.2.60401-83~22.04 amd64 ROCgdb 2025-12-04T10:32:29.4438342Z ii rocm-llvm 19.0.0.25184.60401-83~22.04 amd64 ROCm core compiler 2025-12-04T10:32:29.4438690Z ii rocm-opencl 2.0.0.60401-83~22.04 amd64 clr built using CMake 2025-12-04T10:32:29.4439033Z ii rocm-opencl-dev 2.0.0.60401-83~22.04 amd64 clr built using CMake 2025-12-04T10:32:29.4439391Z ii rocm-smi-lib 7.5.0.60401-83~22.04 amd64 AMD System Management libraries 2025-12-04T10:32:29.4439858Z ii rocm-utils 6.4.1.60401-83~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-12-04T10:32:29.4440256Z ii rocminfo 1.0.0.60401-83~22.04 amd64 Radeon Open Compute (ROCm) Runtime rocminfo tool 2025-12-04T10:32:29.4458325Z ##[group]Run # ignore expansion of "docker ps -q" since it could be empty 2025-12-04T10:32:29.4458647Z # ignore expansion of "docker ps -q" since it could be empty 2025-12-04T10:32:29.4458837Z # shellcheck disable=SC2046 2025-12-04T10:32:29.4458995Z docker stop $(docker ps -q) || true 2025-12-04T10:32:29.4459145Z # Prune all stopped containers. 2025-12-04T10:32:29.4459290Z docker container prune -f 2025-12-04T10:32:29.4463731Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T10:32:29.4463875Z env: 2025-12-04T10:32:29.4463965Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:32:29.4464071Z ##[endgroup] 2025-12-04T10:32:29.4649803Z docker: 'docker stop' requires at least 1 argument 2025-12-04T10:32:29.4649924Z 2025-12-04T10:32:29.4649997Z Usage: docker stop [OPTIONS] CONTAINER [CONTAINER...] 
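The 'docker stop' usage error above is expected on an idle runner: "docker ps -q" expanded to nothing, so docker stop received no container argument, and the "|| true" keeps the step from failing. A minimal sketch of an equivalent cleanup that avoids the error entirely, assuming GNU xargs with its -r (--no-run-if-empty) flag, would be:

  # Stop running containers only if any exist, then remove all stopped containers.
  docker ps -q | xargs -r docker stop
  docker container prune -f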
2025-12-04T10:32:29.4650106Z 2025-12-04T10:32:29.4650176Z See 'docker stop --help' for more information 2025-12-04T10:32:29.4733879Z Total reclaimed space: 0B 2025-12-04T10:32:29.4763783Z ##[group]Run cat /etc/os-release || true 2025-12-04T10:32:29.4764027Z cat /etc/os-release || true 2025-12-04T10:32:29.4764211Z cat /etc/apt/sources.list.d/rocm.list || true 2025-12-04T10:32:29.4764586Z cat /opt/rocm/.info/version || true 2025-12-04T10:32:29.4764749Z whoami 2025-12-04T10:32:29.4769619Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T10:32:29.4769809Z env: 2025-12-04T10:32:29.4769931Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:32:29.4770059Z ##[endgroup] 2025-12-04T10:32:29.4789797Z PRETTY_NAME="Ubuntu 22.04.5 LTS" 2025-12-04T10:32:29.4789924Z NAME="Ubuntu" 2025-12-04T10:32:29.4790035Z VERSION_ID="22.04" 2025-12-04T10:32:29.4790137Z VERSION="22.04.5 LTS (Jammy Jellyfish)" 2025-12-04T10:32:29.4790258Z VERSION_CODENAME=jammy 2025-12-04T10:32:29.4790356Z ID=ubuntu 2025-12-04T10:32:29.4790439Z ID_LIKE=debian 2025-12-04T10:32:29.4790563Z HOME_URL="https://www.ubuntu.com/" 2025-12-04T10:32:29.4790692Z SUPPORT_URL="https://help.ubuntu.com/" 2025-12-04T10:32:29.4790843Z BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/" 2025-12-04T10:32:29.4791054Z PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy" 2025-12-04T10:32:29.4791242Z UBUNTU_CODENAME=jammy 2025-12-04T10:32:29.4796201Z deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/6.4.1 jammy main 2025-12-04T10:32:29.4802827Z 6.4.1-83 2025-12-04T10:32:29.4808707Z runner 2025-12-04T10:32:29.4828204Z ##[group]Run dpkg -l | grep -E " amdgpu" 2025-12-04T10:32:29.4828447Z dpkg -l | grep -E " amdgpu" 2025-12-04T10:32:29.4833259Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T10:32:29.4833579Z env: 2025-12-04T10:32:29.4833675Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:32:29.4833793Z ##[endgroup] 2025-12-04T10:32:29.4883205Z ii amdgpu-core 1:6.4.60401-2164967.22.04 all Core meta package for unified amdgpu driver. 
2025-12-04T10:32:29.4883475Z ii amdgpu-install 6.4.60401-2164967.22.04 all AMDGPU driver repository and installer 2025-12-04T10:32:29.4897053Z ##[group]Run rocm-smi 2025-12-04T10:32:29.4897205Z rocm-smi 2025-12-04T10:32:29.4901376Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T10:32:29.4901596Z env: 2025-12-04T10:32:29.4901701Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:32:29.4901816Z ##[endgroup] 2025-12-04T10:32:29.5531556Z 2025-12-04T10:32:29.5531716Z 2025-12-04T10:32:29.5532088Z ============================================ ROCm System Management Interface ============================================ 2025-12-04T10:32:29.5532678Z ====================================================== Concise Info ====================================================== 2025-12-04T10:32:29.5533284Z Device Node IDs Temp Power Partitions SCLK MCLK Fan Perf PwrCap VRAM% GPU% 2025-12-04T10:32:29.5534176Z  (DID, GUID) (Junction) (Socket) (Mem, Compute, ID)  2025-12-04T10:32:29.5534685Z ========================================================================================================================== 2025-12-04T10:32:29.5535598Z 0 3 0x74a5, 51110 30.0°C 114.0W NPS1, SPX, 0 N/A 900Mhz 0% manual 1000.0W 0% 0% 2025-12-04T10:32:29.5536315Z 1 5 0x74a5, 2987 27.0°C 115.0W NPS1, SPX, 0 N/A 900Mhz 0% manual 1000.0W 0% 0% 2025-12-04T10:32:29.5536999Z 2 4 0x74a5, 61326 27.0°C 123.0W NPS1, SPX, 0 N/A 900Mhz 0% manual 1000.0W 0% 0% 2025-12-04T10:32:29.5537686Z 3 2 0x74a5, 9091 28.0°C 127.0W NPS1, SPX, 0 N/A 900Mhz 0% manual 1000.0W 0% 0% 2025-12-04T10:32:29.5538176Z ========================================================================================================================== 2025-12-04T10:32:29.5538610Z ================================================== End of ROCm SMI Log =================================================== 2025-12-04T10:32:29.5593821Z ##[group]Run rocminfo 2025-12-04T10:32:29.5593950Z rocminfo 2025-12-04T10:32:29.5597647Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T10:32:29.5597793Z env: 2025-12-04T10:32:29.5597884Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:32:29.5597995Z ##[endgroup] 2025-12-04T10:32:29.6494753Z ROCk module version 6.12.12 is loaded 2025-12-04T10:32:29.6494899Z ===================== 2025-12-04T10:32:29.6495066Z HSA System Attributes 2025-12-04T10:32:29.6495164Z ===================== 2025-12-04T10:32:29.6495294Z Runtime Version: 1.15 2025-12-04T10:32:29.6495410Z Runtime Ext Version: 1.7 2025-12-04T10:32:29.6495527Z System Timestamp Freq.: 1000.000000MHz 2025-12-04T10:32:29.6495709Z Sig. 
Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count) 2025-12-04T10:32:29.6495952Z Machine Model: LARGE 2025-12-04T10:32:29.6496213Z System Endianness: LITTLE 2025-12-04T10:32:29.6496408Z Mwaitx: DISABLED 2025-12-04T10:32:29.6496524Z XNACK enabled: NO 2025-12-04T10:32:29.6496663Z DMAbuf Support: YES 2025-12-04T10:32:29.6496767Z VMM Support: YES 2025-12-04T10:32:29.6496838Z 2025-12-04T10:32:29.6496909Z ========== 2025-12-04T10:32:29.6497012Z HSA Agents 2025-12-04T10:32:29.6497103Z ========== 2025-12-04T10:32:29.6497194Z ******* 2025-12-04T10:32:29.6497281Z Agent 1 2025-12-04T10:32:29.6497553Z ******* 2025-12-04T10:32:29.6497671Z Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T10:32:29.6497889Z Uuid: CPU-XX 2025-12-04T10:32:29.6498044Z Marketing Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T10:32:29.6498209Z Vendor Name: CPU 2025-12-04T10:32:29.6498352Z Feature: None specified 2025-12-04T10:32:29.6498506Z Profile: FULL_PROFILE 2025-12-04T10:32:29.6498690Z Float Round Mode: NEAR 2025-12-04T10:32:29.6498849Z Max Queue Number: 0(0x0) 2025-12-04T10:32:29.6499012Z Queue Min Size: 0(0x0) 2025-12-04T10:32:29.6499162Z Queue Max Size: 0(0x0) 2025-12-04T10:32:29.6499341Z Queue Type: MULTI 2025-12-04T10:32:29.6499535Z Node: 0 2025-12-04T10:32:29.6499717Z Device Type: CPU 2025-12-04T10:32:29.6499856Z Cache Info: 2025-12-04T10:32:29.6499968Z L1: 49152(0xc000) KB 2025-12-04T10:32:29.6500106Z Chip ID: 0(0x0) 2025-12-04T10:32:29.6500309Z ASIC Revision: 0(0x0) 2025-12-04T10:32:29.6500462Z Cacheline Size: 64(0x40) 2025-12-04T10:32:29.6500615Z Max Clock Freq. (MHz): 3300 2025-12-04T10:32:29.6500753Z BDFID: 0 2025-12-04T10:32:29.6500890Z Internal Node ID: 0 2025-12-04T10:32:29.6501034Z Compute Unit: 64 2025-12-04T10:32:29.6501184Z SIMDs per CU: 0 2025-12-04T10:32:29.6501325Z Shader Engines: 0 2025-12-04T10:32:29.6501476Z Shader Arrs. per Eng.: 0 2025-12-04T10:32:29.6501727Z WatchPts on Addr. 
Ranges:1 2025-12-04T10:32:29.6501875Z Memory Properties: 2025-12-04T10:32:29.6501986Z Features: None 2025-12-04T10:32:29.6502090Z Pool Info: 2025-12-04T10:32:29.6502275Z Pool 1 2025-12-04T10:32:29.6502457Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T10:32:29.6502688Z Size: 1584776988(0x5e75c71c) KB 2025-12-04T10:32:29.6502840Z Allocatable: TRUE 2025-12-04T10:32:29.6503122Z Alloc Granule: 4KB 2025-12-04T10:32:29.6503375Z Alloc Recommended Granule:4KB 2025-12-04T10:32:29.6503544Z Alloc Alignment: 4KB 2025-12-04T10:32:29.6503734Z Accessible by all: TRUE 2025-12-04T10:32:29.6503871Z Pool 2 2025-12-04T10:32:29.6504005Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T10:32:29.6504152Z Size: 1584776988(0x5e75c71c) KB 2025-12-04T10:32:29.6504353Z Allocatable: TRUE 2025-12-04T10:32:29.6504521Z Alloc Granule: 4KB 2025-12-04T10:32:29.6504722Z Alloc Recommended Granule:4KB 2025-12-04T10:32:29.6504913Z Alloc Alignment: 4KB 2025-12-04T10:32:29.6505068Z Accessible by all: TRUE 2025-12-04T10:32:29.6505264Z Pool 3 2025-12-04T10:32:29.6505394Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2025-12-04T10:32:29.6505539Z Size: 1584776988(0x5e75c71c) KB 2025-12-04T10:32:29.6505707Z Allocatable: TRUE 2025-12-04T10:32:29.6505870Z Alloc Granule: 4KB 2025-12-04T10:32:29.6506026Z Alloc Recommended Granule:4KB 2025-12-04T10:32:29.6506193Z Alloc Alignment: 4KB 2025-12-04T10:32:29.6506347Z Accessible by all: TRUE 2025-12-04T10:32:29.6506497Z Pool 4 2025-12-04T10:32:29.6506679Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T10:32:29.6506821Z Size: 1584776988(0x5e75c71c) KB 2025-12-04T10:32:29.6506982Z Allocatable: TRUE 2025-12-04T10:32:29.6507186Z Alloc Granule: 4KB 2025-12-04T10:32:29.6507358Z Alloc Recommended Granule:4KB 2025-12-04T10:32:29.6507541Z Alloc Alignment: 4KB 2025-12-04T10:32:29.6507691Z Accessible by all: TRUE 2025-12-04T10:32:29.6507827Z ISA Info: 2025-12-04T10:32:29.6507937Z ******* 2025-12-04T10:32:29.6508066Z Agent 2 2025-12-04T10:32:29.6508167Z ******* 2025-12-04T10:32:29.6508286Z Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T10:32:29.6508426Z Uuid: CPU-XX 2025-12-04T10:32:29.6508578Z Marketing Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T10:32:29.6508794Z Vendor Name: CPU 2025-12-04T10:32:29.6509004Z Feature: None specified 2025-12-04T10:32:29.6509174Z Profile: FULL_PROFILE 2025-12-04T10:32:29.6509370Z Float Round Mode: NEAR 2025-12-04T10:32:29.6509617Z Max Queue Number: 0(0x0) 2025-12-04T10:32:29.6509838Z Queue Min Size: 0(0x0) 2025-12-04T10:32:29.6509985Z Queue Max Size: 0(0x0) 2025-12-04T10:32:29.6510230Z Queue Type: MULTI 2025-12-04T10:32:29.6510377Z Node: 1 2025-12-04T10:32:29.6510592Z Device Type: CPU 2025-12-04T10:32:29.6510746Z Cache Info: 2025-12-04T10:32:29.6510862Z L1: 49152(0xc000) KB 2025-12-04T10:32:29.6511061Z Chip ID: 0(0x0) 2025-12-04T10:32:29.6511204Z ASIC Revision: 0(0x0) 2025-12-04T10:32:29.6511360Z Cacheline Size: 64(0x40) 2025-12-04T10:32:29.6511558Z Max Clock Freq. (MHz): 3300 2025-12-04T10:32:29.6511701Z BDFID: 0 2025-12-04T10:32:29.6511845Z Internal Node ID: 1 2025-12-04T10:32:29.6512024Z Compute Unit: 64 2025-12-04T10:32:29.6512211Z SIMDs per CU: 0 2025-12-04T10:32:29.6512408Z Shader Engines: 0 2025-12-04T10:32:29.6512558Z Shader Arrs. per Eng.: 0 2025-12-04T10:32:29.6512791Z WatchPts on Addr. 
Ranges:1 2025-12-04T10:32:29.6512927Z Memory Properties: 2025-12-04T10:32:29.6513147Z Features: None 2025-12-04T10:32:29.6513263Z Pool Info: 2025-12-04T10:32:29.6513362Z Pool 1 2025-12-04T10:32:29.6513532Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T10:32:29.6513707Z Size: 1585311804(0x5e7df03c) KB 2025-12-04T10:32:29.6513881Z Allocatable: TRUE 2025-12-04T10:32:29.6514035Z Alloc Granule: 4KB 2025-12-04T10:32:29.6514206Z Alloc Recommended Granule:4KB 2025-12-04T10:32:29.6514475Z Alloc Alignment: 4KB 2025-12-04T10:32:29.6514634Z Accessible by all: TRUE 2025-12-04T10:32:29.6514770Z Pool 2 2025-12-04T10:32:29.6514900Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T10:32:29.6515084Z Size: 1585311804(0x5e7df03c) KB 2025-12-04T10:32:29.6515232Z Allocatable: TRUE 2025-12-04T10:32:29.6515448Z Alloc Granule: 4KB 2025-12-04T10:32:29.6515621Z Alloc Recommended Granule:4KB 2025-12-04T10:32:29.6515803Z Alloc Alignment: 4KB 2025-12-04T10:32:29.6516012Z Accessible by all: TRUE 2025-12-04T10:32:29.6516226Z Pool 3 2025-12-04T10:32:29.6516360Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2025-12-04T10:32:29.6516521Z Size: 1585311804(0x5e7df03c) KB 2025-12-04T10:32:29.6516674Z Allocatable: TRUE 2025-12-04T10:32:29.6516903Z Alloc Granule: 4KB 2025-12-04T10:32:29.6517080Z Alloc Recommended Granule:4KB 2025-12-04T10:32:29.6517295Z Alloc Alignment: 4KB 2025-12-04T10:32:29.6517450Z Accessible by all: TRUE 2025-12-04T10:32:29.6517582Z Pool 4 2025-12-04T10:32:29.6517741Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T10:32:29.6517886Z Size: 1585311804(0x5e7df03c) KB 2025-12-04T10:32:29.6518069Z Allocatable: TRUE 2025-12-04T10:32:29.6518299Z Alloc Granule: 4KB 2025-12-04T10:32:29.6518477Z Alloc Recommended Granule:4KB 2025-12-04T10:32:29.6518693Z Alloc Alignment: 4KB 2025-12-04T10:32:29.6518864Z Accessible by all: TRUE 2025-12-04T10:32:29.6518998Z ISA Info: 2025-12-04T10:32:29.6519171Z ******* 2025-12-04T10:32:29.6519344Z Agent 3 2025-12-04T10:32:29.6519439Z ******* 2025-12-04T10:32:29.6519551Z Name: gfx942 2025-12-04T10:32:29.6519745Z Uuid: GPU-41f9686c3d70a95c 2025-12-04T10:32:29.6519894Z Marketing Name: AMD Instinct MI325X 2025-12-04T10:32:29.6520072Z Vendor Name: AMD 2025-12-04T10:32:29.6520224Z Feature: KERNEL_DISPATCH 2025-12-04T10:32:29.6520415Z Profile: BASE_PROFILE 2025-12-04T10:32:29.6520589Z Float Round Mode: NEAR 2025-12-04T10:32:29.6520751Z Max Queue Number: 128(0x80) 2025-12-04T10:32:29.6520904Z Queue Min Size: 64(0x40) 2025-12-04T10:32:29.6521086Z Queue Max Size: 131072(0x20000) 2025-12-04T10:32:29.6521228Z Queue Type: MULTI 2025-12-04T10:32:29.6521370Z Node: 2 2025-12-04T10:32:29.6521506Z Device Type: GPU 2025-12-04T10:32:29.6521640Z Cache Info: 2025-12-04T10:32:29.6521755Z L1: 32(0x20) KB 2025-12-04T10:32:29.6521890Z L2: 4096(0x1000) KB 2025-12-04T10:32:29.6522024Z L3: 262144(0x40000) KB 2025-12-04T10:32:29.6522191Z Chip ID: 29861(0x74a5) 2025-12-04T10:32:29.6522386Z ASIC Revision: 1(0x1) 2025-12-04T10:32:29.6522564Z Cacheline Size: 128(0x80) 2025-12-04T10:32:29.6522712Z Max Clock Freq. (MHz): 2100 2025-12-04T10:32:29.6522913Z BDFID: 29952 2025-12-04T10:32:29.6523057Z Internal Node ID: 2 2025-12-04T10:32:29.6523254Z Compute Unit: 304 2025-12-04T10:32:29.6523468Z SIMDs per CU: 4 2025-12-04T10:32:29.6523655Z Shader Engines: 32 2025-12-04T10:32:29.6523803Z Shader Arrs. per Eng.: 1 2025-12-04T10:32:29.6524019Z WatchPts on Addr. 
Ranges:4 2025-12-04T10:32:29.6524189Z Coherent Host Access: FALSE 2025-12-04T10:32:29.6524367Z Memory Properties: 2025-12-04T10:32:29.6524514Z Features: KERNEL_DISPATCH 2025-12-04T10:32:29.6524654Z Fast F16 Operation: TRUE 2025-12-04T10:32:29.6524905Z Wavefront Size: 64(0x40) 2025-12-04T10:32:29.6525088Z Workgroup Max Size: 1024(0x400) 2025-12-04T10:32:29.6525228Z Workgroup Max Size per Dimension: 2025-12-04T10:32:29.6525387Z x 1024(0x400) 2025-12-04T10:32:29.6525553Z y 1024(0x400) 2025-12-04T10:32:29.6525714Z z 1024(0x400) 2025-12-04T10:32:29.6525853Z Max Waves Per CU: 32(0x20) 2025-12-04T10:32:29.6526070Z Max Work-item Per CU: 2048(0x800) 2025-12-04T10:32:29.6526232Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T10:32:29.6526409Z Grid Max Size per Dimension: 2025-12-04T10:32:29.6526523Z x 4294967295(0xffffffff) 2025-12-04T10:32:29.6526661Z y 4294967295(0xffffffff) 2025-12-04T10:32:29.6526865Z z 4294967295(0xffffffff) 2025-12-04T10:32:29.6527033Z Max fbarriers/Workgrp: 32 2025-12-04T10:32:29.6532809Z Packet Processor uCode:: 185 2025-12-04T10:32:29.6532974Z SDMA engine uCode:: 24 2025-12-04T10:32:29.6533139Z IOMMU Support:: None 2025-12-04T10:32:29.6533338Z Pool Info: 2025-12-04T10:32:29.6533478Z Pool 1 2025-12-04T10:32:29.6533678Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T10:32:29.6533844Z Size: 268419072(0xfffc000) KB 2025-12-04T10:32:29.6533992Z Allocatable: TRUE 2025-12-04T10:32:29.6534242Z Alloc Granule: 4KB 2025-12-04T10:32:29.6534404Z Alloc Recommended Granule:2048KB 2025-12-04T10:32:29.6534646Z Alloc Alignment: 4KB 2025-12-04T10:32:29.6534837Z Accessible by all: FALSE 2025-12-04T10:32:29.6535003Z Pool 2 2025-12-04T10:32:29.6535135Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T10:32:29.6535287Z Size: 268419072(0xfffc000) KB 2025-12-04T10:32:29.6535432Z Allocatable: TRUE 2025-12-04T10:32:29.6535588Z Alloc Granule: 4KB 2025-12-04T10:32:29.6535899Z Alloc Recommended Granule:2048KB 2025-12-04T10:32:29.6536056Z Alloc Alignment: 4KB 2025-12-04T10:32:29.6536365Z Accessible by all: FALSE 2025-12-04T10:32:29.6536499Z Pool 3 2025-12-04T10:32:29.6536662Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T10:32:29.6536809Z Size: 268419072(0xfffc000) KB 2025-12-04T10:32:29.6537029Z Allocatable: TRUE 2025-12-04T10:32:29.6537270Z Alloc Granule: 4KB 2025-12-04T10:32:29.6537477Z Alloc Recommended Granule:2048KB 2025-12-04T10:32:29.6537700Z Alloc Alignment: 4KB 2025-12-04T10:32:29.6537908Z Accessible by all: FALSE 2025-12-04T10:32:29.6538041Z Pool 4 2025-12-04T10:32:29.6538199Z Segment: GROUP 2025-12-04T10:32:29.6538366Z Size: 64(0x40) KB 2025-12-04T10:32:29.6538504Z Allocatable: FALSE 2025-12-04T10:32:29.6538724Z Alloc Granule: 0KB 2025-12-04T10:32:29.6538882Z Alloc Recommended Granule:0KB 2025-12-04T10:32:29.6539078Z Alloc Alignment: 0KB 2025-12-04T10:32:29.6539264Z Accessible by all: FALSE 2025-12-04T10:32:29.6539480Z ISA Info: 2025-12-04T10:32:29.6539625Z ISA 1 2025-12-04T10:32:29.6539799Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T10:32:29.6539960Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T10:32:29.6540120Z Profiles: HSA_PROFILE_BASE 2025-12-04T10:32:29.6540280Z Default Rounding Mode: NEAR 2025-12-04T10:32:29.6540503Z Default Rounding Mode: NEAR 2025-12-04T10:32:29.6540656Z Fast f16: TRUE 2025-12-04T10:32:29.6540812Z Workgroup Max Size: 1024(0x400) 2025-12-04T10:32:29.6540953Z Workgroup Max Size per Dimension: 2025-12-04T10:32:29.6541085Z x 1024(0x400) 2025-12-04T10:32:29.6541231Z y 1024(0x400) 2025-12-04T10:32:29.6541399Z z 1024(0x400) 
2025-12-04T10:32:29.6541605Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T10:32:29.6541779Z Grid Max Size per Dimension: 2025-12-04T10:32:29.6541902Z x 4294967295(0xffffffff) 2025-12-04T10:32:29.6542034Z y 4294967295(0xffffffff) 2025-12-04T10:32:29.6542224Z z 4294967295(0xffffffff) 2025-12-04T10:32:29.6542369Z FBarrier Max Size: 32 2025-12-04T10:32:29.6542537Z ISA 2 2025-12-04T10:32:29.6542759Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T10:32:29.6542990Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T10:32:29.6543211Z Profiles: HSA_PROFILE_BASE 2025-12-04T10:32:29.6543372Z Default Rounding Mode: NEAR 2025-12-04T10:32:29.6543555Z Default Rounding Mode: NEAR 2025-12-04T10:32:29.6543704Z Fast f16: TRUE 2025-12-04T10:32:29.6543875Z Workgroup Max Size: 1024(0x400) 2025-12-04T10:32:29.6544077Z Workgroup Max Size per Dimension: 2025-12-04T10:32:29.6544223Z x 1024(0x400) 2025-12-04T10:32:29.6544352Z y 1024(0x400) 2025-12-04T10:32:29.6544497Z z 1024(0x400) 2025-12-04T10:32:29.6544674Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T10:32:29.6544812Z Grid Max Size per Dimension: 2025-12-04T10:32:29.6544962Z x 4294967295(0xffffffff) 2025-12-04T10:32:29.6545095Z y 4294967295(0xffffffff) 2025-12-04T10:32:29.6545226Z z 4294967295(0xffffffff) 2025-12-04T10:32:29.6545454Z FBarrier Max Size: 32 2025-12-04T10:32:29.6545612Z ******* 2025-12-04T10:32:29.6545708Z Agent 4 2025-12-04T10:32:29.6545836Z ******* 2025-12-04T10:32:29.6545994Z Name: gfx942 2025-12-04T10:32:29.6546131Z Uuid: GPU-e2954cd4b2ef3669 2025-12-04T10:32:29.6546291Z Marketing Name: AMD Instinct MI325X 2025-12-04T10:32:29.6546447Z Vendor Name: AMD 2025-12-04T10:32:29.6546601Z Feature: KERNEL_DISPATCH 2025-12-04T10:32:29.6546826Z Profile: BASE_PROFILE 2025-12-04T10:32:29.6547004Z Float Round Mode: NEAR 2025-12-04T10:32:29.6547156Z Max Queue Number: 128(0x80) 2025-12-04T10:32:29.6547411Z Queue Min Size: 64(0x40) 2025-12-04T10:32:29.6547572Z Queue Max Size: 131072(0x20000) 2025-12-04T10:32:29.6547770Z Queue Type: MULTI 2025-12-04T10:32:29.6547909Z Node: 3 2025-12-04T10:32:29.6561746Z Device Type: GPU 2025-12-04T10:32:29.6561911Z Cache Info: 2025-12-04T10:32:29.6562039Z L1: 32(0x20) KB 2025-12-04T10:32:29.6562183Z L2: 4096(0x1000) KB 2025-12-04T10:32:29.6562321Z L3: 262144(0x40000) KB 2025-12-04T10:32:29.6562460Z Chip ID: 29861(0x74a5) 2025-12-04T10:32:29.6562608Z ASIC Revision: 1(0x1) 2025-12-04T10:32:29.6562759Z Cacheline Size: 128(0x80) 2025-12-04T10:32:29.6562912Z Max Clock Freq. (MHz): 2100 2025-12-04T10:32:29.6563055Z BDFID: 1280 2025-12-04T10:32:29.6563197Z Internal Node ID: 3 2025-12-04T10:32:29.6563356Z Compute Unit: 304 2025-12-04T10:32:29.6563507Z SIMDs per CU: 4 2025-12-04T10:32:29.6563721Z Shader Engines: 32 2025-12-04T10:32:29.6563885Z Shader Arrs. per Eng.: 1 2025-12-04T10:32:29.6564053Z WatchPts on Addr. 
Ranges:4 2025-12-04T10:32:29.6564213Z Coherent Host Access: FALSE 2025-12-04T10:32:29.6564359Z Memory Properties: 2025-12-04T10:32:29.6564477Z Features: KERNEL_DISPATCH 2025-12-04T10:32:29.6564622Z Fast F16 Operation: TRUE 2025-12-04T10:32:29.6564778Z Wavefront Size: 64(0x40) 2025-12-04T10:32:29.6564928Z Workgroup Max Size: 1024(0x400) 2025-12-04T10:32:29.6565077Z Workgroup Max Size per Dimension: 2025-12-04T10:32:29.6565213Z x 1024(0x400) 2025-12-04T10:32:29.6565338Z y 1024(0x400) 2025-12-04T10:32:29.6565469Z z 1024(0x400) 2025-12-04T10:32:29.6565605Z Max Waves Per CU: 32(0x20) 2025-12-04T10:32:29.6565761Z Max Work-item Per CU: 2048(0x800) 2025-12-04T10:32:29.6565915Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T10:32:29.6566048Z Grid Max Size per Dimension: 2025-12-04T10:32:29.6566168Z x 4294967295(0xffffffff) 2025-12-04T10:32:29.6566306Z y 4294967295(0xffffffff) 2025-12-04T10:32:29.6566429Z z 4294967295(0xffffffff) 2025-12-04T10:32:29.6566577Z Max fbarriers/Workgrp: 32 2025-12-04T10:32:29.6566743Z Packet Processor uCode:: 185 2025-12-04T10:32:29.6566914Z SDMA engine uCode:: 24 2025-12-04T10:32:29.6567071Z IOMMU Support:: None 2025-12-04T10:32:29.6567202Z Pool Info: 2025-12-04T10:32:29.6567311Z Pool 1 2025-12-04T10:32:29.6567443Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T10:32:29.6567594Z Size: 268419072(0xfffc000) KB 2025-12-04T10:32:29.6567744Z Allocatable: TRUE 2025-12-04T10:32:29.6567901Z Alloc Granule: 4KB 2025-12-04T10:32:29.6568110Z Alloc Recommended Granule:2048KB 2025-12-04T10:32:29.6568276Z Alloc Alignment: 4KB 2025-12-04T10:32:29.6568431Z Accessible by all: FALSE 2025-12-04T10:32:29.6568571Z Pool 2 2025-12-04T10:32:29.6568705Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T10:32:29.6568854Z Size: 268419072(0xfffc000) KB 2025-12-04T10:32:29.6569001Z Allocatable: TRUE 2025-12-04T10:32:29.6569154Z Alloc Granule: 4KB 2025-12-04T10:32:29.6569311Z Alloc Recommended Granule:2048KB 2025-12-04T10:32:29.6569469Z Alloc Alignment: 4KB 2025-12-04T10:32:29.6569659Z Accessible by all: FALSE 2025-12-04T10:32:29.6569803Z Pool 3 2025-12-04T10:32:29.6569931Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T10:32:29.6570072Z Size: 268419072(0xfffc000) KB 2025-12-04T10:32:29.6570212Z Allocatable: TRUE 2025-12-04T10:32:29.6570368Z Alloc Granule: 4KB 2025-12-04T10:32:29.6570556Z Alloc Recommended Granule:2048KB 2025-12-04T10:32:29.6570718Z Alloc Alignment: 4KB 2025-12-04T10:32:29.6570876Z Accessible by all: FALSE 2025-12-04T10:32:29.6571004Z Pool 4 2025-12-04T10:32:29.6571125Z Segment: GROUP 2025-12-04T10:32:29.6571262Z Size: 64(0x40) KB 2025-12-04T10:32:29.6571414Z Allocatable: FALSE 2025-12-04T10:32:29.6571568Z Alloc Granule: 0KB 2025-12-04T10:32:29.6571721Z Alloc Recommended Granule:0KB 2025-12-04T10:32:29.6571879Z Alloc Alignment: 0KB 2025-12-04T10:32:29.6572035Z Accessible by all: FALSE 2025-12-04T10:32:29.6572170Z ISA Info: 2025-12-04T10:32:29.6572268Z ISA 1 2025-12-04T10:32:29.6572393Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T10:32:29.6572559Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T10:32:29.6572722Z Profiles: HSA_PROFILE_BASE 2025-12-04T10:32:29.6572875Z Default Rounding Mode: NEAR 2025-12-04T10:32:29.6573043Z Default Rounding Mode: NEAR 2025-12-04T10:32:29.6573196Z Fast f16: TRUE 2025-12-04T10:32:29.6573343Z Workgroup Max Size: 1024(0x400) 2025-12-04T10:32:29.6573484Z Workgroup Max Size per Dimension: 2025-12-04T10:32:29.6573610Z x 1024(0x400) 2025-12-04T10:32:29.6573743Z y 1024(0x400) 2025-12-04T10:32:29.6573872Z z 1024(0x400) 
2025-12-04T10:32:29.6574006Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T10:32:29.6574141Z Grid Max Size per Dimension: 2025-12-04T10:32:29.6574258Z x 4294967295(0xffffffff) 2025-12-04T10:32:29.6574382Z y 4294967295(0xffffffff) 2025-12-04T10:32:29.6574553Z z 4294967295(0xffffffff) 2025-12-04T10:32:29.6574695Z FBarrier Max Size: 32 2025-12-04T10:32:29.6574828Z ISA 2 2025-12-04T10:32:29.6574974Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T10:32:29.6575144Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T10:32:29.6575305Z Profiles: HSA_PROFILE_BASE 2025-12-04T10:32:29.6575469Z Default Rounding Mode: NEAR 2025-12-04T10:32:29.6575628Z Default Rounding Mode: NEAR 2025-12-04T10:32:29.6575782Z Fast f16: TRUE 2025-12-04T10:32:29.6575934Z Workgroup Max Size: 1024(0x400) 2025-12-04T10:32:29.6576075Z Workgroup Max Size per Dimension: 2025-12-04T10:32:29.6576204Z x 1024(0x400) 2025-12-04T10:32:29.6576335Z y 1024(0x400) 2025-12-04T10:32:29.6576465Z z 1024(0x400) 2025-12-04T10:32:29.6576605Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T10:32:29.6576741Z Grid Max Size per Dimension: 2025-12-04T10:32:29.6576864Z x 4294967295(0xffffffff) 2025-12-04T10:32:29.6577047Z y 4294967295(0xffffffff) 2025-12-04T10:32:29.6577174Z z 4294967295(0xffffffff) 2025-12-04T10:32:29.6577320Z FBarrier Max Size: 32 2025-12-04T10:32:29.6577458Z ******* 2025-12-04T10:32:29.6577557Z Agent 5 2025-12-04T10:32:29.6577658Z ******* 2025-12-04T10:32:29.6577770Z Name: gfx942 2025-12-04T10:32:29.6577921Z Uuid: GPU-d34a48edc983a6e7 2025-12-04T10:32:29.6578077Z Marketing Name: AMD Instinct MI325X 2025-12-04T10:32:29.6578231Z Vendor Name: AMD 2025-12-04T10:32:29.6578385Z Feature: KERNEL_DISPATCH 2025-12-04T10:32:29.6578540Z Profile: BASE_PROFILE 2025-12-04T10:32:29.6578699Z Float Round Mode: NEAR 2025-12-04T10:32:29.6578856Z Max Queue Number: 128(0x80) 2025-12-04T10:32:29.6579004Z Queue Min Size: 64(0x40) 2025-12-04T10:32:29.6579151Z Queue Max Size: 131072(0x20000) 2025-12-04T10:32:29.6579301Z Queue Type: MULTI 2025-12-04T10:32:29.6579437Z Node: 4 2025-12-04T10:32:29.6579629Z Device Type: GPU 2025-12-04T10:32:29.6579764Z Cache Info: 2025-12-04T10:32:29.6579876Z L1: 32(0x20) KB 2025-12-04T10:32:29.6580012Z L2: 4096(0x1000) KB 2025-12-04T10:32:29.6580139Z L3: 262144(0x40000) KB 2025-12-04T10:32:29.6580276Z Chip ID: 29861(0x74a5) 2025-12-04T10:32:29.6580429Z ASIC Revision: 1(0x1) 2025-12-04T10:32:29.6580578Z Cacheline Size: 128(0x80) 2025-12-04T10:32:29.6580734Z Max Clock Freq. (MHz): 2100 2025-12-04T10:32:29.6580879Z BDFID: 25856 2025-12-04T10:32:29.6581021Z Internal Node ID: 4 2025-12-04T10:32:29.6581239Z Compute Unit: 304 2025-12-04T10:32:29.6581385Z SIMDs per CU: 4 2025-12-04T10:32:29.6581537Z Shader Engines: 32 2025-12-04T10:32:29.6581693Z Shader Arrs. per Eng.: 1 2025-12-04T10:32:29.6581851Z WatchPts on Addr. 
Ranges:4 2025-12-04T10:32:29.6582013Z Coherent Host Access: FALSE 2025-12-04T10:32:29.6582156Z Memory Properties: 2025-12-04T10:32:29.6582270Z Features: KERNEL_DISPATCH 2025-12-04T10:32:29.6582412Z Fast F16 Operation: TRUE 2025-12-04T10:32:29.6582566Z Wavefront Size: 64(0x40) 2025-12-04T10:32:29.6582722Z Workgroup Max Size: 1024(0x400) 2025-12-04T10:32:29.6582864Z Workgroup Max Size per Dimension: 2025-12-04T10:32:29.6582990Z x 1024(0x400) 2025-12-04T10:32:29.6583119Z y 1024(0x400) 2025-12-04T10:32:29.6583246Z z 1024(0x400) 2025-12-04T10:32:29.6583383Z Max Waves Per CU: 32(0x20) 2025-12-04T10:32:29.6583535Z Max Work-item Per CU: 2048(0x800) 2025-12-04T10:32:29.6583691Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T10:32:29.6583866Z Grid Max Size per Dimension: 2025-12-04T10:32:29.6583981Z x 4294967295(0xffffffff) 2025-12-04T10:32:29.6584110Z y 4294967295(0xffffffff) 2025-12-04T10:32:29.6584237Z z 4294967295(0xffffffff) 2025-12-04T10:32:29.6584385Z Max fbarriers/Workgrp: 32 2025-12-04T10:32:29.6584544Z Packet Processor uCode:: 185 2025-12-04T10:32:29.6584708Z SDMA engine uCode:: 24 2025-12-04T10:32:29.6584863Z IOMMU Support:: None 2025-12-04T10:32:29.6584993Z Pool Info: 2025-12-04T10:32:29.6585102Z Pool 1 2025-12-04T10:32:29.6585231Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T10:32:29.6585384Z Size: 268419072(0xfffc000) KB 2025-12-04T10:32:29.6585540Z Allocatable: TRUE 2025-12-04T10:32:29.6585688Z Alloc Granule: 4KB 2025-12-04T10:32:29.6585852Z Alloc Recommended Granule:2048KB 2025-12-04T10:32:29.6586015Z Alloc Alignment: 4KB 2025-12-04T10:32:29.6586170Z Accessible by all: FALSE 2025-12-04T10:32:29.6586310Z Pool 2 2025-12-04T10:32:29.6586436Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T10:32:29.6586585Z Size: 268419072(0xfffc000) KB 2025-12-04T10:32:29.6586732Z Allocatable: TRUE 2025-12-04T10:32:29.6586881Z Alloc Granule: 4KB 2025-12-04T10:32:29.6587045Z Alloc Recommended Granule:2048KB 2025-12-04T10:32:29.6587214Z Alloc Alignment: 4KB 2025-12-04T10:32:29.6587370Z Accessible by all: FALSE 2025-12-04T10:32:29.6587511Z Pool 3 2025-12-04T10:32:29.6587641Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T10:32:29.6587787Z Size: 268419072(0xfffc000) KB 2025-12-04T10:32:29.6587936Z Allocatable: TRUE 2025-12-04T10:32:29.6588120Z Alloc Granule: 4KB 2025-12-04T10:32:29.6588284Z Alloc Recommended Granule:2048KB 2025-12-04T10:32:29.6588448Z Alloc Alignment: 4KB 2025-12-04T10:32:29.6588601Z Accessible by all: FALSE 2025-12-04T10:32:29.6588742Z Pool 4 2025-12-04T10:32:29.6588873Z Segment: GROUP 2025-12-04T10:32:29.6589012Z Size: 64(0x40) KB 2025-12-04T10:32:29.6589158Z Allocatable: FALSE 2025-12-04T10:32:29.6589308Z Alloc Granule: 0KB 2025-12-04T10:32:29.6589471Z Alloc Recommended Granule:0KB 2025-12-04T10:32:29.6589695Z Alloc Alignment: 0KB 2025-12-04T10:32:29.6589854Z Accessible by all: FALSE 2025-12-04T10:32:29.6589994Z ISA Info: 2025-12-04T10:32:29.6590100Z ISA 1 2025-12-04T10:32:29.6590227Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T10:32:29.6590392Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T10:32:29.6590548Z Profiles: HSA_PROFILE_BASE 2025-12-04T10:32:29.6590747Z Default Rounding Mode: NEAR 2025-12-04T10:32:29.6590912Z Default Rounding Mode: NEAR 2025-12-04T10:32:29.6591061Z Fast f16: TRUE 2025-12-04T10:32:29.6591213Z Workgroup Max Size: 1024(0x400) 2025-12-04T10:32:29.6591362Z Workgroup Max Size per Dimension: 2025-12-04T10:32:29.6591491Z x 1024(0x400) 2025-12-04T10:32:29.6591626Z y 1024(0x400) 2025-12-04T10:32:29.6591758Z z 1024(0x400) 
2025-12-04T10:32:29.6591898Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T10:32:29.6592040Z Grid Max Size per Dimension: 2025-12-04T10:32:29.6592160Z x 4294967295(0xffffffff) 2025-12-04T10:32:29.6592303Z y 4294967295(0xffffffff) 2025-12-04T10:32:29.6592438Z z 4294967295(0xffffffff) 2025-12-04T10:32:29.6592581Z FBarrier Max Size: 32 2025-12-04T10:32:29.6592722Z ISA 2 2025-12-04T10:32:29.6592865Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T10:32:29.6593035Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T10:32:29.6593200Z Profiles: HSA_PROFILE_BASE 2025-12-04T10:32:29.6593357Z Default Rounding Mode: NEAR 2025-12-04T10:32:29.6593522Z Default Rounding Mode: NEAR 2025-12-04T10:32:29.6593677Z Fast f16: TRUE 2025-12-04T10:32:29.6593825Z Workgroup Max Size: 1024(0x400) 2025-12-04T10:32:29.6593974Z Workgroup Max Size per Dimension: 2025-12-04T10:32:29.6594107Z x 1024(0x400) 2025-12-04T10:32:29.6594234Z y 1024(0x400) 2025-12-04T10:32:29.6594365Z z 1024(0x400) 2025-12-04T10:32:29.6594508Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T10:32:29.6594642Z Grid Max Size per Dimension: 2025-12-04T10:32:29.6594810Z x 4294967295(0xffffffff) 2025-12-04T10:32:29.6594938Z y 4294967295(0xffffffff) 2025-12-04T10:32:29.6595073Z z 4294967295(0xffffffff) 2025-12-04T10:32:29.6595218Z FBarrier Max Size: 32 2025-12-04T10:32:29.6595351Z ******* 2025-12-04T10:32:29.6595453Z Agent 6 2025-12-04T10:32:29.6595553Z ******* 2025-12-04T10:32:29.6595665Z Name: gfx942 2025-12-04T10:32:29.6595810Z Uuid: GPU-f24a9834b47f1628 2025-12-04T10:32:29.6595960Z Marketing Name: AMD Instinct MI325X 2025-12-04T10:32:29.6596120Z Vendor Name: AMD 2025-12-04T10:32:29.6596274Z Feature: KERNEL_DISPATCH 2025-12-04T10:32:29.6596426Z Profile: BASE_PROFILE 2025-12-04T10:32:29.6596580Z Float Round Mode: NEAR 2025-12-04T10:32:29.6596734Z Max Queue Number: 128(0x80) 2025-12-04T10:32:29.6596881Z Queue Min Size: 64(0x40) 2025-12-04T10:32:29.6597029Z Queue Max Size: 131072(0x20000) 2025-12-04T10:32:29.6597204Z Queue Type: MULTI 2025-12-04T10:32:29.6597346Z Node: 5 2025-12-04T10:32:29.6597488Z Device Type: GPU 2025-12-04T10:32:29.6597617Z Cache Info: 2025-12-04T10:32:29.6597734Z L1: 32(0x20) KB 2025-12-04T10:32:29.6597868Z L2: 4096(0x1000) KB 2025-12-04T10:32:29.6597998Z L3: 262144(0x40000) KB 2025-12-04T10:32:29.6598136Z Chip ID: 29861(0x74a5) 2025-12-04T10:32:29.6598281Z ASIC Revision: 1(0x1) 2025-12-04T10:32:29.6598437Z Cacheline Size: 128(0x80) 2025-12-04T10:32:29.6598594Z Max Clock Freq. (MHz): 2100 2025-12-04T10:32:29.6598734Z BDFID: 5376 2025-12-04T10:32:29.6598888Z Internal Node ID: 5 2025-12-04T10:32:29.6599041Z Compute Unit: 304 2025-12-04T10:32:29.6599186Z SIMDs per CU: 4 2025-12-04T10:32:29.6599337Z Shader Engines: 32 2025-12-04T10:32:29.6599494Z Shader Arrs. per Eng.: 1 2025-12-04T10:32:29.6599703Z WatchPts on Addr. 
Ranges:4 2025-12-04T10:32:29.6599865Z Coherent Host Access: FALSE 2025-12-04T10:32:29.6600003Z Memory Properties: 2025-12-04T10:32:29.6600121Z Features: KERNEL_DISPATCH 2025-12-04T10:32:29.6600266Z Fast F16 Operation: TRUE 2025-12-04T10:32:29.6600420Z Wavefront Size: 64(0x40) 2025-12-04T10:32:29.6600577Z Workgroup Max Size: 1024(0x400) 2025-12-04T10:32:29.6600710Z Workgroup Max Size per Dimension: 2025-12-04T10:32:29.6600827Z x 1024(0x400) 2025-12-04T10:32:29.6600945Z y 1024(0x400) 2025-12-04T10:32:29.6601066Z z 1024(0x400) 2025-12-04T10:32:29.6601202Z Max Waves Per CU: 32(0x20) 2025-12-04T10:32:29.6601388Z Max Work-item Per CU: 2048(0x800) 2025-12-04T10:32:29.6601540Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T10:32:29.6601673Z Grid Max Size per Dimension: 2025-12-04T10:32:29.6601783Z x 4294967295(0xffffffff) 2025-12-04T10:32:29.6601911Z y 4294967295(0xffffffff) 2025-12-04T10:32:29.6602034Z z 4294967295(0xffffffff) 2025-12-04T10:32:29.6602182Z Max fbarriers/Workgrp: 32 2025-12-04T10:32:29.6602344Z Packet Processor uCode:: 185 2025-12-04T10:32:29.6602500Z SDMA engine uCode:: 24 2025-12-04T10:32:29.6602654Z IOMMU Support:: None 2025-12-04T10:32:29.6602789Z Pool Info: 2025-12-04T10:32:29.6602888Z Pool 1 2025-12-04T10:32:29.6603015Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T10:32:29.6603170Z Size: 268419072(0xfffc000) KB 2025-12-04T10:32:29.6603317Z Allocatable: TRUE 2025-12-04T10:32:29.6603468Z Alloc Granule: 4KB 2025-12-04T10:32:29.6603623Z Alloc Recommended Granule:2048KB 2025-12-04T10:32:29.6603783Z Alloc Alignment: 4KB 2025-12-04T10:32:29.6603984Z Accessible by all: FALSE 2025-12-04T10:32:29.6604112Z Pool 2 2025-12-04T10:32:29.6604235Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T10:32:29.6604377Z Size: 268419072(0xfffc000) KB 2025-12-04T10:32:29.6604514Z Allocatable: TRUE 2025-12-04T10:32:29.6604663Z Alloc Granule: 4KB 2025-12-04T10:32:29.6604825Z Alloc Recommended Granule:2048KB 2025-12-04T10:32:29.6604981Z Alloc Alignment: 4KB 2025-12-04T10:32:29.6605268Z Accessible by all: FALSE 2025-12-04T10:32:29.6605581Z Pool 3 2025-12-04T10:32:29.6605703Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T10:32:29.6605852Z Size: 268419072(0xfffc000) KB 2025-12-04T10:32:29.6605992Z Allocatable: TRUE 2025-12-04T10:32:29.6606152Z Alloc Granule: 4KB 2025-12-04T10:32:29.6606309Z Alloc Recommended Granule:2048KB 2025-12-04T10:32:29.6606468Z Alloc Alignment: 4KB 2025-12-04T10:32:29.6606627Z Accessible by all: FALSE 2025-12-04T10:32:29.6606763Z Pool 4 2025-12-04T10:32:29.6606882Z Segment: GROUP 2025-12-04T10:32:29.6607019Z Size: 64(0x40) KB 2025-12-04T10:32:29.6607161Z Allocatable: FALSE 2025-12-04T10:32:29.6607307Z Alloc Granule: 0KB 2025-12-04T10:32:29.6607464Z Alloc Recommended Granule:0KB 2025-12-04T10:32:29.6607616Z Alloc Alignment: 0KB 2025-12-04T10:32:29.6607769Z Accessible by all: FALSE 2025-12-04T10:32:29.6607901Z ISA Info: 2025-12-04T10:32:29.6607996Z ISA 1 2025-12-04T10:32:29.6608119Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T10:32:29.6608312Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T10:32:29.6608466Z Profiles: HSA_PROFILE_BASE 2025-12-04T10:32:29.6608620Z Default Rounding Mode: NEAR 2025-12-04T10:32:29.6608772Z Default Rounding Mode: NEAR 2025-12-04T10:32:29.6608920Z Fast f16: TRUE 2025-12-04T10:32:29.6609067Z Workgroup Max Size: 1024(0x400) 2025-12-04T10:32:29.6609207Z Workgroup Max Size per Dimension: 2025-12-04T10:32:29.6609335Z x 1024(0x400) 2025-12-04T10:32:29.6609462Z y 1024(0x400) 2025-12-04T10:32:29.6609637Z z 1024(0x400) 
2025-12-04T10:32:29.6609776Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T10:32:29.6609908Z Grid Max Size per Dimension: 2025-12-04T10:32:29.6610031Z x 4294967295(0xffffffff) 2025-12-04T10:32:29.6610158Z y 4294967295(0xffffffff) 2025-12-04T10:32:29.6610281Z z 4294967295(0xffffffff) 2025-12-04T10:32:29.6610422Z FBarrier Max Size: 32 2025-12-04T10:32:29.6610555Z ISA 2 2025-12-04T10:32:29.6610729Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T10:32:29.6610897Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T10:32:29.6611050Z Profiles: HSA_PROFILE_BASE 2025-12-04T10:32:29.6611208Z Default Rounding Mode: NEAR 2025-12-04T10:32:29.6611388Z Default Rounding Mode: NEAR 2025-12-04T10:32:29.6611538Z Fast f16: TRUE 2025-12-04T10:32:29.6611687Z Workgroup Max Size: 1024(0x400) 2025-12-04T10:32:29.6611829Z Workgroup Max Size per Dimension: 2025-12-04T10:32:29.6611949Z x 1024(0x400) 2025-12-04T10:32:29.6612074Z y 1024(0x400) 2025-12-04T10:32:29.6612197Z z 1024(0x400) 2025-12-04T10:32:29.6612337Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T10:32:29.6612474Z Grid Max Size per Dimension: 2025-12-04T10:32:29.6612588Z x 4294967295(0xffffffff) 2025-12-04T10:32:29.6612716Z y 4294967295(0xffffffff) 2025-12-04T10:32:29.6612845Z z 4294967295(0xffffffff) 2025-12-04T10:32:29.6612983Z FBarrier Max Size: 32 2025-12-04T10:32:29.6613118Z *** Done *** 2025-12-04T10:32:29.6623279Z ##[group]Run ngpu=$(rocminfo | grep -c -E 'Name:.*\sgfx') 2025-12-04T10:32:29.6623458Z ngpu=$(rocminfo | grep -c -E 'Name:.*\sgfx') 2025-12-04T10:32:29.6623733Z msg="Please file an issue on pytorch/pytorch reporting the faulty runner. Include a link to the runner logs so the runner can be identified" 2025-12-04T10:32:29.6623992Z if [[ $ngpu -eq 0 ]]; then 2025-12-04T10:32:29.6624144Z  echo "Error: Failed to detect any GPUs on the runner" 2025-12-04T10:32:29.6624288Z  echo "$msg" 2025-12-04T10:32:29.6624382Z  exit 1 2025-12-04T10:32:29.6624474Z fi 2025-12-04T10:32:29.6627353Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T10:32:29.6627494Z env: 2025-12-04T10:32:29.6627582Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:32:29.6627681Z ##[endgroup] 2025-12-04T10:32:29.7730121Z ##[group]Run pytorch/pytorch/.github/actions/diskspace-cleanup@main 2025-12-04T10:32:29.7730332Z with: 2025-12-04T10:32:29.7730450Z diskspace-cutoff: 70 2025-12-04T10:32:29.7730573Z env: 2025-12-04T10:32:29.7730687Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:32:29.7730819Z ##[endgroup] 2025-12-04T10:32:29.7753418Z ##[group]Run set -ex 2025-12-04T10:32:29.7753553Z set -ex 2025-12-04T10:32:29.7753658Z diskspace_cutoff=70 2025-12-04T10:32:29.7753805Z docker_root_dir=$(docker info -f '{{.DockerRootDir}}') 2025-12-04T10:32:29.7753986Z if [ ! -d "$docker_root_dir" ]; then 2025-12-04T10:32:29.7754188Z  echo "Docker root directory ($docker_root_dir) does not exist. Skipping disk space check." 2025-12-04T10:32:29.7754374Z  exit 0 2025-12-04T10:32:29.7754473Z fi 2025-12-04T10:32:29.7754644Z diskspace=$(df -H --output=pcent ${docker_root_dir} | sed -n 2p | sed 's/%//' | sed 's/ //') 2025-12-04T10:32:29.7754974Z msg="Please file an issue on pytorch/pytorch reporting the faulty runner. 
Include a link to the runner logs so the runner can be identified" 2025-12-04T10:32:29.7755258Z if [[ "$diskspace" -ge "$diskspace_cutoff" ]] ; then 2025-12-04T10:32:29.7755411Z  docker system prune -af 2025-12-04T10:32:29.7755598Z  diskspace_new=$(df -H --output=pcent ${docker_root_dir} | sed -n 2p | sed 's/%//' | sed 's/ //') 2025-12-04T10:32:29.7755814Z  if [[ "$diskspace_new" -gt "$diskspace_cutoff" ]] ; then 2025-12-04T10:32:29.7756120Z  diskspace_cutoff_int=$((diskspace_cutoff + 0)) 2025-12-04T10:32:29.7756277Z  difference=$((100 - diskspace_cutoff_int)) 2025-12-04T10:32:29.7756487Z  echo "Error: Available diskspace is less than $difference percent. Not enough diskspace." 2025-12-04T10:32:29.7756677Z  echo "$msg" 2025-12-04T10:32:29.7756779Z  exit 1 2025-12-04T10:32:29.7756880Z  else 2025-12-04T10:32:29.7756996Z  difference=$((diskspace - diskspace_new)) 2025-12-04T10:32:29.7757155Z  echo "Diskspace saved: $difference percent" 2025-12-04T10:32:29.7757289Z  fi 2025-12-04T10:32:29.7757374Z fi 2025-12-04T10:32:29.7760538Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T10:32:29.7760687Z env: 2025-12-04T10:32:29.7760775Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:32:29.7760882Z ##[endgroup] 2025-12-04T10:32:29.7777063Z + diskspace_cutoff=70 2025-12-04T10:32:29.7780181Z ++ docker info -f '{{.DockerRootDir}}' 2025-12-04T10:32:29.8128643Z + docker_root_dir=/home/runner/docker-data 2025-12-04T10:32:29.8130225Z + '[' '!' -d /home/runner/docker-data ']' 2025-12-04T10:32:29.8137142Z ++ df -H --output=pcent /home/runner/docker-data 2025-12-04T10:32:29.8137359Z ++ sed -n 2p 2025-12-04T10:32:29.8137484Z ++ sed s/%// 2025-12-04T10:32:29.8139438Z ++ sed 's/ //' 2025-12-04T10:32:29.8155005Z + diskspace=' 3' 2025-12-04T10:32:29.8156499Z + msg='Please file an issue on pytorch/pytorch reporting the faulty runner. 
Include a link to the runner logs so the runner can be identified' 2025-12-04T10:32:29.8157104Z + [[ 3 -ge 70 ]] 2025-12-04T10:32:29.8185951Z ##[group]Run RUNNER_ARTIFACT_DIR="${RUNNER_TEMP}/artifacts" 2025-12-04T10:32:29.8186194Z RUNNER_ARTIFACT_DIR="${RUNNER_TEMP}/artifacts" 2025-12-04T10:32:29.8186355Z rm -rf "${RUNNER_ARTIFACT_DIR}" 2025-12-04T10:32:29.8186505Z mkdir -p "${RUNNER_ARTIFACT_DIR}" 2025-12-04T10:32:29.8186703Z echo "RUNNER_ARTIFACT_DIR=${RUNNER_ARTIFACT_DIR}" >> "${GITHUB_ENV}" 2025-12-04T10:32:29.8186883Z  2025-12-04T10:32:29.8187011Z RUNNER_TEST_RESULTS_DIR="${RUNNER_TEMP}/test-results" 2025-12-04T10:32:29.8187179Z rm -rf "${RUNNER_TEST_RESULTS_DIR}" 2025-12-04T10:32:29.8187327Z mkdir -p "${RUNNER_TEST_RESULTS_DIR}" 2025-12-04T10:32:29.8187517Z echo "RUNNER_TEST_RESULTS_DIR=${RUNNER_TEST_RESULTS_DIR}" >> "${GITHUB_ENV}" 2025-12-04T10:32:29.8187695Z  2025-12-04T10:32:29.8187975Z RUNNER_DOCS_DIR="${RUNNER_TEMP}/docs" 2025-12-04T10:32:29.8188112Z rm -rf "${RUNNER_DOCS_DIR}" 2025-12-04T10:32:29.8188244Z mkdir -p "${RUNNER_DOCS_DIR}" 2025-12-04T10:32:29.8188407Z echo "RUNNER_DOCS_DIR=${RUNNER_DOCS_DIR}" >> "${GITHUB_ENV}" 2025-12-04T10:32:29.8192991Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T10:32:29.8193137Z env: 2025-12-04T10:32:29.8193231Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:32:29.8193341Z ##[endgroup] 2025-12-04T10:32:29.8274922Z ##[group]Run env | grep '^GITHUB' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T10:32:29.8275199Z env | grep '^GITHUB' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T10:32:29.8275402Z env | grep '^CI' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T10:32:29.8279723Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T10:32:29.8279879Z env: 2025-12-04T10:32:29.8280002Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:32:29.8280149Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:32:29.8280329Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:32:29.8280505Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:32:29.8280648Z ##[endgroup] 2025-12-04T10:32:29.8323696Z ##[group]Run # All GPUs are visible to the runner; visibility, if needed, will be set by run_test.py. 2025-12-04T10:32:29.8324115Z # All GPUs are visible to the runner; visibility, if needed, will be set by run_test.py. 2025-12-04T10:32:29.8324324Z # Add render group for container creation. 2025-12-04T10:32:29.8324503Z render_gid=`cat /etc/group | grep render | cut -d: -f3` 2025-12-04T10:32:29.8324712Z # Ensure GPU isolation if pod is part of kubernetes setup with DEVICE_FLAG. 2025-12-04T10:32:29.8324918Z if [ -f "/etc/podinfo/gha-render-devices" ]; then 2025-12-04T10:32:29.8325100Z  DEVICE_FLAG=$(cat /etc/podinfo/gha-render-devices) 2025-12-04T10:32:29.8325244Z else 2025-12-04T10:32:29.8325351Z  DEVICE_FLAG="--device /dev/dri" 2025-12-04T10:32:29.8325467Z fi 2025-12-04T10:32:29.8325652Z # The --group-add daemon and --group-add bin are needed in the Ubuntu 24.04 and Almalinux OSs respectively. 2025-12-04T10:32:29.8325934Z # This is due to the device files (/dev/kfd & /dev/dri) being owned by video group on bare metal. 2025-12-04T10:32:29.8326189Z # This video group ID maps to subgid 1 inside the docker image due to the /etc/subgid entries. 2025-12-04T10:32:29.8326458Z # The group name corresponding to group ID 1 can change depending on the OS, so both are necessary. 
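Aside (not part of the job output): the comments directly above explain why the test container is started with the video group, the host's render group, and the daemon/bin groups. As a minimal sketch of the same idea, assuming a bare-metal host that exposes /dev/kfd and /dev/dri and defines a render group in /etc/group, the GID lookup and device flags could be reproduced by hand roughly as follows; the /dev/mem, ptrace, seccomp and host-network flags that the real step also appends are omitted here.

#!/usr/bin/env bash
# Sketch only: rebuild the ROCm device/group flags described in the comments above.
set -euo pipefail
# getent is used here instead of grepping /etc/group; both return the render group's GID.
render_gid=$(getent group render | cut -d: -f3)
if [ -f /etc/podinfo/gha-render-devices ]; then
  # Kubernetes pods get an explicit list of render devices for GPU isolation.
  DEVICE_FLAG=$(cat /etc/podinfo/gha-render-devices)
else
  # Bare metal simply exposes the whole DRI directory.
  DEVICE_FLAG="--device /dev/dri"
fi
GPU_FLAG="--device=/dev/kfd ${DEVICE_FLAG} --group-add video --group-add ${render_gid} --group-add daemon --group-add bin"
echo "GPU_FLAG=${GPU_FLAG}"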
2025-12-04T10:32:29.8326906Z echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd $DEVICE_FLAG --group-add video --group-add $render_gid --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host" >> "${GITHUB_ENV}" 2025-12-04T10:32:29.8329931Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T10:32:29.8330073Z env: 2025-12-04T10:32:29.8330165Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:32:29.8330293Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:32:29.8330472Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:32:29.8330637Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:32:29.8330768Z ##[endgroup] 2025-12-04T10:32:29.8401119Z ##[group]Run aws-actions/configure-aws-credentials@ececac1a45f3b08a01d2dd070d28d111c5fe6722 2025-12-04T10:32:29.8401317Z with: 2025-12-04T10:32:29.8401467Z role-to-assume: arn:aws:iam::308535385114:role/gha_workflow_s3_and_ecr_read_only 2025-12-04T10:32:29.8401638Z aws-region: us-east-1 2025-12-04T10:32:29.8401754Z role-duration-seconds: 18000 2025-12-04T10:32:29.8401879Z audience: sts.amazonaws.com 2025-12-04T10:32:29.8401988Z env: 2025-12-04T10:32:29.8402083Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:32:29.8402321Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:32:29.8402495Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:32:29.8402659Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:32:29.8403159Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:32:29.8403649Z ##[endgroup] 2025-12-04T10:32:30.1515822Z Assuming role with OIDC 2025-12-04T10:32:30.5150809Z Authenticated as assumedRoleId AROAUPVRELQNLLCOPFEJR:GitHubActions 2025-12-04T10:32:30.6112937Z ##[group]Run aws-actions/amazon-ecr-login@062b18b96a7aff071d4dc91bc00c4c1a7945b076 2025-12-04T10:32:30.6113158Z with: 2025-12-04T10:32:30.6113265Z mask-password: true 2025-12-04T10:32:30.6113397Z registry-type: private 2025-12-04T10:32:30.6113523Z skip-logout: false 2025-12-04T10:32:30.6113635Z env: 2025-12-04T10:32:30.6113743Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:32:30.6113890Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:32:30.6114078Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:32:30.6114258Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:32:30.6114956Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:32:30.6115467Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T10:32:30.6115594Z AWS_REGION: us-east-1 2025-12-04T10:32:30.6116030Z AWS_ACCESS_KEY_ID: *** 2025-12-04T10:32:30.6116199Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T10:32:30.6118268Z AWS_SESSION_TOKEN: *** 2025-12-04T10:32:30.6118381Z ##[endgroup] 2025-12-04T10:32:31.0445875Z Logging into registry 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T10:32:31.6974061Z ##[group]Run env | grep '^GITHUB' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 
2025-12-04T10:32:31.6974377Z env | grep '^GITHUB' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T10:32:31.6974640Z env | grep '^CI' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T10:32:31.6974905Z env | grep '^RUNNER' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T10:32:31.6980763Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T10:32:31.6980952Z env: 2025-12-04T10:32:31.6981081Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:32:31.6981262Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:32:31.6981495Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:32:31.6981713Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:32:31.6982389Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:32:31.6982982Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T10:32:31.6983113Z AWS_REGION: us-east-1 2025-12-04T10:32:31.6983416Z AWS_ACCESS_KEY_ID: *** 2025-12-04T10:32:31.6983584Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T10:32:31.6985742Z AWS_SESSION_TOKEN: *** 2025-12-04T10:32:31.6985859Z ##[endgroup] 2025-12-04T10:32:31.7089552Z ##[group]Run ngpu=$(rocminfo | grep -c -E 'Name:.*\sgfx') 2025-12-04T10:32:31.7089966Z ngpu=$(rocminfo | grep -c -E 'Name:.*\sgfx') 2025-12-04T10:32:31.7090217Z if [[ $ngpu -lt 2 ]]; then #We are temporarily reducing this down to 2 from 4 so that we can run tests on nodes with less gpus. 2025-12-04T10:32:31.7090508Z  echo "Error: only $ngpu GPU(s) detected, at least 2 GPUs are needed for distributed jobs" 2025-12-04T10:32:31.7090696Z  exit 1 2025-12-04T10:32:31.7090796Z fi 2025-12-04T10:32:31.7095107Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T10:32:31.7095259Z env: 2025-12-04T10:32:31.7095359Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:32:31.7095501Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:32:31.7095699Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:32:31.7095876Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:32:31.7096410Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:32:31.7096908Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T10:32:31.7097030Z AWS_REGION: us-east-1 2025-12-04T10:32:31.7097322Z AWS_ACCESS_KEY_ID: *** 2025-12-04T10:32:31.7097484Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T10:32:31.7099499Z AWS_SESSION_TOKEN: *** 2025-12-04T10:32:31.7099651Z ##[endgroup] 2025-12-04T10:32:31.8237598Z ##[group]Run pytorch/test-infra/.github/actions/calculate-docker-image@main 2025-12-04T10:32:31.8237812Z with: 2025-12-04T10:32:31.8238310Z docker-image-name: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T10:32:31.8238648Z use-custom-docker-registry: true 2025-12-04T10:32:31.8238796Z docker-build-dir: .ci/docker 2025-12-04T10:32:31.8238939Z docker-build-script: ./build.sh 2025-12-04T10:32:31.8239076Z working-directory: . 
2025-12-04T10:32:31.8239236Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T10:32:31.8239415Z force-push: false 2025-12-04T10:32:31.8239529Z env: 2025-12-04T10:32:31.8239694Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:32:31.8239852Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:32:31.8240046Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:32:31.8240256Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:32:31.8240819Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:32:31.8241370Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T10:32:31.8241500Z AWS_REGION: us-east-1 2025-12-04T10:32:31.8241776Z AWS_ACCESS_KEY_ID: *** 2025-12-04T10:32:31.8241947Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T10:32:31.8244132Z AWS_SESSION_TOKEN: *** 2025-12-04T10:32:31.8244257Z ##[endgroup] 2025-12-04T10:32:31.8255079Z ##[group]Run set -ex 2025-12-04T10:32:31.8255221Z set -ex 2025-12-04T10:32:31.8255317Z  2025-12-04T10:32:31.8255473Z # If the docker build directory or the build script doesn't exist, the action will 2025-12-04T10:32:31.8255718Z # gracefully return the docker image name as it is. Pulling docker image in Linux 2025-12-04T10:32:31.8255929Z # job could then download the pre-built image as usual 2025-12-04T10:32:31.8256187Z if [[ -d "${DOCKER_BUILD_DIR}" ]] && [[ -f "${DOCKER_BUILD_DIR}/${DOCKER_BUILD_SCRIPT}" ]] && [[ "${USE_CUSTOM_DOCKER_REGISTRY}" == "true" ]]; then 2025-12-04T10:32:31.8256423Z  echo "skip=false" >> "${GITHUB_OUTPUT}" 2025-12-04T10:32:31.8256557Z else 2025-12-04T10:32:31.8256668Z  echo "skip=true" >> "${GITHUB_OUTPUT}" 2025-12-04T10:32:31.8256843Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2025-12-04T10:32:31.8257002Z  2025-12-04T10:32:31.8257206Z  echo "Not using custom ECR registry. Either it was not requested or there is no Docker build script in the ${REPO_NAME} repo..." 
2025-12-04T10:32:31.8257438Z  exit 0 2025-12-04T10:32:31.8257533Z fi 2025-12-04T10:32:31.8257626Z  2025-12-04T10:32:31.8257764Z if [[ "${DOCKER_IMAGE_NAME}" == *"${DOCKER_REGISTRY}/${REPO_NAME}"* ]]; then 2025-12-04T10:32:31.8257992Z  # The docker image name already includes the ECR prefix and tag, so we can just 2025-12-04T10:32:31.8258194Z  # use it as it is, but first let's extract the tag 2025-12-04T10:32:31.8258383Z  DOCKER_TAG=$(echo "${DOCKER_IMAGE_NAME}" | awk -F '[:,]' '{print $2}') 2025-12-04T10:32:31.8258574Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-12-04T10:32:31.8258759Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2025-12-04T10:32:31.8258915Z else 2025-12-04T10:32:31.8259027Z  if [[ "${DOCKER_IMAGE_NAME}" == *:* ]]; then 2025-12-04T10:32:31.8259181Z  CUSTOM_TAG_PREFIX=${DOCKER_IMAGE_NAME#*:} 2025-12-04T10:32:31.8259334Z  DOCKER_IMAGE_NAME=${DOCKER_IMAGE_NAME%%:*} 2025-12-04T10:32:31.8259462Z  fi 2025-12-04T10:32:31.8259930Z  DOCKER_TAG=${CUSTOM_TAG_PREFIX:+${CUSTOM_TAG_PREFIX}-}$(git rev-parse HEAD:"${DOCKER_BUILD_DIR}") 2025-12-04T10:32:31.8260159Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-12-04T10:32:31.8260439Z  echo "docker-image=${DOCKER_REGISTRY}/${REPO_NAME}/${DOCKER_IMAGE_NAME}:${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-12-04T10:32:31.8260691Z  echo "custom-tag-prefix=${CUSTOM_TAG_PREFIX}" >> "${GITHUB_OUTPUT}" 2025-12-04T10:32:31.8260851Z fi 2025-12-04T10:32:31.8265071Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T10:32:31.8265217Z env: 2025-12-04T10:32:31.8265309Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:32:31.8265450Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:32:31.8265629Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:32:31.8265798Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:32:31.8266310Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:32:31.8266804Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T10:32:31.8266923Z AWS_REGION: us-east-1 2025-12-04T10:32:31.8267071Z AWS_ACCESS_KEY_ID: *** 2025-12-04T10:32:31.8267229Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T10:32:31.8269395Z AWS_SESSION_TOKEN: *** 2025-12-04T10:32:31.8269507Z REPO_NAME: pytorch 2025-12-04T10:32:31.8269845Z DOCKER_IMAGE_NAME: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T10:32:31.8270138Z DOCKER_BUILD_DIR: .ci/docker 2025-12-04T10:32:31.8270259Z DOCKER_BUILD_SCRIPT: ./build.sh 2025-12-04T10:32:31.8270411Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T10:32:31.8270575Z USE_CUSTOM_DOCKER_REGISTRY: true 2025-12-04T10:32:31.8270694Z CUSTOM_TAG_PREFIX: 2025-12-04T10:32:31.8270797Z ##[endgroup] 2025-12-04T10:32:31.8288893Z + [[ -d .ci/docker ]] 2025-12-04T10:32:31.8289012Z + [[ -f .ci/docker/./build.sh ]] 2025-12-04T10:32:31.8289136Z + [[ true == \t\r\u\e ]] 2025-12-04T10:32:31.8289241Z + echo skip=false 2025-12-04T10:32:31.8289649Z + [[ 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a == 
*\3\0\8\5\3\5\3\8\5\1\1\4\.\d\k\r\.\e\c\r\.\u\s\-\e\a\s\t\-\1\.\a\m\a\z\o\n\a\w\s\.\c\o\m\/\p\y\t\o\r\c\h* ]] 2025-12-04T10:32:31.8295017Z ++ echo 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T10:32:31.8296472Z ++ awk -F '[:,]' '{print $2}' 2025-12-04T10:32:31.8307401Z + DOCKER_TAG=pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T10:32:31.8307684Z + echo docker-tag=pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T10:32:31.8308119Z + echo docker-image=308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T10:32:31.8347599Z ##[group]Run set +e 2025-12-04T10:32:31.8347750Z set +e 2025-12-04T10:32:31.8347853Z set -x 2025-12-04T10:32:31.8347951Z  2025-12-04T10:32:31.8348045Z login() { 2025-12-04T10:32:31.8348252Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2025-12-04T10:32:31.8348455Z } 2025-12-04T10:32:31.8348547Z  2025-12-04T10:32:31.8348642Z retry () { 2025-12-04T10:32:31.8348765Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2025-12-04T10:32:31.8348897Z } 2025-12-04T10:32:31.8348990Z  2025-12-04T10:32:31.8349092Z retry login "${DOCKER_REGISTRY}" 2025-12-04T10:32:31.8349218Z  2025-12-04T10:32:31.8349464Z START_TIME=$(date +%s) 2025-12-04T10:32:31.8349746Z # Wait up to 120 minutes 2025-12-04T10:32:31.8350047Z while [[ $(( $(date +%s) - 7200 )) -lt $START_TIME ]]; do 2025-12-04T10:32:31.8350246Z  # Check if image already exists, if it does then skip building it 2025-12-04T10:32:31.8350446Z  if docker manifest inspect "${DOCKER_IMAGE}"; then 2025-12-04T10:32:31.8350594Z  exit 0 2025-12-04T10:32:31.8350697Z  fi 2025-12-04T10:32:31.8350793Z  2025-12-04T10:32:31.8350951Z  # NB: This flag is used by Docker build workflow to push the image to ECR, so we can 2025-12-04T10:32:31.8351208Z  # use this to differentiate between the Docker build and regular build jobs. For the 2025-12-04T10:32:31.8351462Z  # latter, it will wait for the Docker images to become available before continuing 2025-12-04T10:32:31.8351670Z  if [ "${DOCKER_PUSH:-false}" == "true" ]; then 2025-12-04T10:32:31.8351844Z  # It's a Docker build job, let's build the image 2025-12-04T10:32:31.8351984Z  break 2025-12-04T10:32:31.8352093Z  else 2025-12-04T10:32:31.8352232Z  # It's a regular build job, wait for the image to become available 2025-12-04T10:32:31.8352392Z  sleep 300 2025-12-04T10:32:31.8352498Z  fi 2025-12-04T10:32:31.8352592Z done 2025-12-04T10:32:31.8352685Z  2025-12-04T10:32:31.8352827Z # NB: This part requires a full checkout. Otherwise, the merge base will 2025-12-04T10:32:31.8353040Z # be empty. 
The default action would be to continue rebuild the image 2025-12-04T10:32:31.8353235Z if [[ "$BASE_REVISION" = "$(git rev-parse HEAD)" ]]; then 2025-12-04T10:32:31.8353412Z  # if we're on the base branch then use the parent commit 2025-12-04T10:32:31.8353569Z  MERGE_BASE=$(git rev-parse HEAD~) 2025-12-04T10:32:31.8353698Z else 2025-12-04T10:32:31.8353828Z  # otherwise we're on a PR, so use the most recent base commit 2025-12-04T10:32:31.8354009Z  MERGE_BASE=$(git merge-base HEAD "$BASE_REVISION") 2025-12-04T10:32:31.8354155Z fi 2025-12-04T10:32:31.8354244Z  2025-12-04T10:32:31.8354344Z if [[ -z "${MERGE_BASE}" ]]; then 2025-12-04T10:32:31.8354491Z  echo "rebuild=true" >> "${GITHUB_OUTPUT}" 2025-12-04T10:32:31.8354623Z  2025-12-04T10:32:31.8354801Z  echo "Finding merge base only works with full checkout, please set fetch-depth to 0, continuing ..." 2025-12-04T10:32:31.8355003Z  exit 0 2025-12-04T10:32:31.8355099Z fi 2025-12-04T10:32:31.8355189Z  2025-12-04T10:32:31.8355316Z if ! git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}"; then 2025-12-04T10:32:31.8355568Z  echo "Directory '${DOCKER_BUILD_DIR}' not found in commit $MERGE_BASE, you should rebase onto a more recent commit" 2025-12-04T10:32:31.8355791Z  exit 1 2025-12-04T10:32:31.8355887Z fi 2025-12-04T10:32:31.8355978Z  2025-12-04T10:32:31.8356126Z PREVIOUS_DOCKER_TAG=$(git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}") 2025-12-04T10:32:31.8356370Z # If no image exists but the hash is the same as the previous hash then we should error out here 2025-12-04T10:32:31.8356590Z if [[ "${PREVIOUS_DOCKER_TAG}" == "${DOCKER_TAG}" ]]; then 2025-12-04T10:32:31.8356842Z  echo "WARNING: Something has gone wrong and the previous image isn't available for the merge-base of your branch" 2025-12-04T10:32:31.8357119Z  echo " Will re-build docker image to store in local cache, TTS may be longer" 2025-12-04T10:32:31.8357291Z fi 2025-12-04T10:32:31.8357383Z  2025-12-04T10:32:31.8357495Z echo "rebuild=true" >> "${GITHUB_OUTPUT}" 2025-12-04T10:32:31.8360608Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T10:32:31.8360805Z env: 2025-12-04T10:32:31.8360904Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:32:31.8361048Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:32:31.8361283Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:32:31.8361452Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:32:31.8361963Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:32:31.8362468Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T10:32:31.8362590Z AWS_REGION: us-east-1 2025-12-04T10:32:31.8362751Z AWS_ACCESS_KEY_ID: *** 2025-12-04T10:32:31.8362910Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T10:32:31.8364905Z AWS_SESSION_TOKEN: *** 2025-12-04T10:32:31.8365023Z DOCKER_BUILD_DIR: .ci/docker 2025-12-04T10:32:31.8365166Z BASE_REVISION: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T10:32:31.8365480Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T10:32:31.8365843Z DOCKER_TAG: pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T10:32:31.8366073Z DOCKER_REGISTRY: 
308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T10:32:31.8366225Z DOCKER_PUSH: 2025-12-04T10:32:31.8366323Z ##[endgroup] 2025-12-04T10:32:31.8385623Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T10:32:31.8385797Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T10:32:31.8388724Z + aws ecr get-login-password --region us-east-1 2025-12-04T10:32:31.8388930Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T10:32:31.8389206Z /home/runner/_work/_temp/f1912837-f3f9-4b32-8dc8-31249691bcf9.sh: line 5: aws: command not found 2025-12-04T10:32:31.8465480Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T10:32:31.8474178Z + sleep 1 2025-12-04T10:32:32.8484958Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T10:32:32.8488181Z + aws ecr get-login-password --region us-east-1 2025-12-04T10:32:32.8488585Z /home/runner/_work/_temp/f1912837-f3f9-4b32-8dc8-31249691bcf9.sh: line 5: aws: command not found 2025-12-04T10:32:32.8489252Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T10:32:32.8582014Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T10:32:32.8592693Z + sleep 2 2025-12-04T10:32:34.8604926Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T10:32:34.8608808Z + aws ecr get-login-password --region us-east-1 2025-12-04T10:32:34.8609323Z /home/runner/_work/_temp/f1912837-f3f9-4b32-8dc8-31249691bcf9.sh: line 5: aws: command not found 2025-12-04T10:32:34.8609997Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T10:32:34.8705173Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T10:32:34.8718116Z ++ date +%s 2025-12-04T10:32:34.8725210Z + START_TIME=1764844354 2025-12-04T10:32:34.8729405Z ++ date +%s 2025-12-04T10:32:34.8739645Z + [[ 1764837154 -lt 1764844354 ]] 2025-12-04T10:32:34.8740133Z + docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T10:32:36.2737186Z { 2025-12-04T10:32:36.2737437Z "schemaVersion": 2, 2025-12-04T10:32:36.2737769Z "mediaType": "application/vnd.docker.distribution.manifest.v2+json", 2025-12-04T10:32:36.2738161Z "config": { 2025-12-04T10:32:36.2738405Z "mediaType": "application/vnd.docker.container.image.v1+json", 2025-12-04T10:32:36.2738674Z "size": 30520, 2025-12-04T10:32:36.2738977Z "digest": "sha256:45252333063339f104d56e41f20304e9511ab21c7768e8d156b95ddf24a9dbe5" 2025-12-04T10:32:36.2739995Z }, 2025-12-04T10:32:36.2740133Z "layers": [ 2025-12-04T10:32:36.2740282Z { 2025-12-04T10:32:36.2740508Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2740935Z "size": 30447951, 2025-12-04T10:32:36.2741211Z "digest": "sha256:63e5bc7682b85ae57a1221210f64d62e7a90b0a30f19af4ca734b8242ae49d63" 2025-12-04T10:32:36.2741506Z }, 2025-12-04T10:32:36.2741644Z { 2025-12-04T10:32:36.2741862Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2742127Z "size": 1554, 2025-12-04T10:32:36.2742397Z "digest": "sha256:835841cca3b7e1464290cdb78e48773e03583413fbed852c3cc5165a392ea44d" 2025-12-04T10:32:36.2742686Z }, 2025-12-04T10:32:36.2742817Z { 2025-12-04T10:32:36.2743031Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2743294Z "size": 313275691, 2025-12-04T10:32:36.2743573Z "digest": 
"sha256:aac69780afc8611a5f94a235792d39ae055249c8319ef43b78675998a9b2f825" 2025-12-04T10:32:36.2743862Z }, 2025-12-04T10:32:36.2743995Z { 2025-12-04T10:32:36.2744210Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2744476Z "size": 704, 2025-12-04T10:32:36.2744750Z "digest": "sha256:029495b23122c840ca0e52d487afa8d2c4dbf1991cd7f204ec3e434dcf947bf4" 2025-12-04T10:32:36.2745055Z }, 2025-12-04T10:32:36.2745190Z { 2025-12-04T10:32:36.2745396Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2745657Z "size": 1218, 2025-12-04T10:32:36.2745953Z "digest": "sha256:d0fb85b008332051a3f7c052721ef68bde404b46c23fa43ad040373bd367826c" 2025-12-04T10:32:36.2746244Z }, 2025-12-04T10:32:36.2746370Z { 2025-12-04T10:32:36.2746580Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2746846Z "size": 484, 2025-12-04T10:32:36.2747135Z "digest": "sha256:59b63930883363c7d2aaab27cc61555d9f3e119dc18247a8624c98ebdaa354a5" 2025-12-04T10:32:36.2747367Z }, 2025-12-04T10:32:36.2747480Z { 2025-12-04T10:32:36.2747654Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2747877Z "size": 110363202, 2025-12-04T10:32:36.2748106Z "digest": "sha256:dc112c89d57aa1e85082e40a56e5bc743d64f834ae2f98afe91f60c248354d38" 2025-12-04T10:32:36.2748337Z }, 2025-12-04T10:32:36.2748453Z { 2025-12-04T10:32:36.2748620Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2748826Z "size": 4436, 2025-12-04T10:32:36.2749036Z "digest": "sha256:522eab2402e5001810155ef7eb56940b7c01a4fef62ac588886981c3b8ee8e1e" 2025-12-04T10:32:36.2749272Z }, 2025-12-04T10:32:36.2749377Z { 2025-12-04T10:32:36.2749547Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2749827Z "size": 1755, 2025-12-04T10:32:36.2750050Z "digest": "sha256:2b5a11b41761d8ea3b829e4772e4064cb6c4e4989126af324d0057661e4493a1" 2025-12-04T10:32:36.2750282Z }, 2025-12-04T10:32:36.2750388Z { 2025-12-04T10:32:36.2750567Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2750779Z "size": 724, 2025-12-04T10:32:36.2750993Z "digest": "sha256:9681563a88ff9e62494a2740e537440d3df978d466c9478d6a941fae8b57b084" 2025-12-04T10:32:36.2751235Z }, 2025-12-04T10:32:36.2751340Z { 2025-12-04T10:32:36.2751506Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2751727Z "size": 3185588166, 2025-12-04T10:32:36.2751951Z "digest": "sha256:73e33534e9eb94cf29418d65944168962b65fe21f55e9b8bad18c76e9b3a37b8" 2025-12-04T10:32:36.2752188Z }, 2025-12-04T10:32:36.2752304Z { 2025-12-04T10:32:36.2752469Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2752676Z "size": 396, 2025-12-04T10:32:36.2752895Z "digest": "sha256:5bfdaeb5578d6ffcd7db29c48303cbceb13c591210feaa216a8daa7a6d445b4b" 2025-12-04T10:32:36.2753137Z }, 2025-12-04T10:32:36.2753241Z { 2025-12-04T10:32:36.2753480Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2753690Z "size": 236863, 2025-12-04T10:32:36.2753909Z "digest": "sha256:c07d27e4d3a5ba4ad5325bb785b2e4f058fe5e10ec1aeeb413a1e152b073f203" 2025-12-04T10:32:36.2754201Z }, 2025-12-04T10:32:36.2754310Z { 2025-12-04T10:32:36.2754474Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2754684Z "size": 787, 2025-12-04T10:32:36.2754901Z "digest": "sha256:b21856d1bf420da6fa8ec7331b82ab355d4f4178644e7d3a3d3d0fbc3610109a" 
2025-12-04T10:32:36.2755140Z }, 2025-12-04T10:32:36.2755249Z { 2025-12-04T10:32:36.2755429Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2755639Z "size": 106, 2025-12-04T10:32:36.2755855Z "digest": "sha256:cb19d84867e4063f55db9459c28c50a2abc37c06d3c1ca82ba95fa8427cc438a" 2025-12-04T10:32:36.2756095Z }, 2025-12-04T10:32:36.2756203Z { 2025-12-04T10:32:36.2756371Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2756591Z "size": 1496, 2025-12-04T10:32:36.2756815Z "digest": "sha256:8165374f8dccf88a7791a5d31afbe29e4d4542b4f1cf1904945e07f9af6bf8ba" 2025-12-04T10:32:36.2757050Z }, 2025-12-04T10:32:36.2757139Z { 2025-12-04T10:32:36.2757273Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2757439Z "size": 458789560, 2025-12-04T10:32:36.2757629Z "digest": "sha256:1aecc77354ceba59ec6f0d37a558f2dbb6d5c0854553ee8505ac8707b422da6d" 2025-12-04T10:32:36.2757817Z }, 2025-12-04T10:32:36.2757905Z { 2025-12-04T10:32:36.2758039Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2758204Z "size": 164, 2025-12-04T10:32:36.2758371Z "digest": "sha256:465d3fd643aa2ea0ad07335cda66f12f1d7e5e800c4e9385ec466bc8a1ceabda" 2025-12-04T10:32:36.2758565Z }, 2025-12-04T10:32:36.2758653Z { 2025-12-04T10:32:36.2758783Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2758994Z "size": 104, 2025-12-04T10:32:36.2759341Z "digest": "sha256:6c503e779d6f41ca7f51309875df2b725c171926aece7009c4b8a64d1ba3f58e" 2025-12-04T10:32:36.2759686Z }, 2025-12-04T10:32:36.2759781Z { 2025-12-04T10:32:36.2759918Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2760094Z "size": 724, 2025-12-04T10:32:36.2760266Z "digest": "sha256:9681563a88ff9e62494a2740e537440d3df978d466c9478d6a941fae8b57b084" 2025-12-04T10:32:36.2760453Z }, 2025-12-04T10:32:36.2760540Z { 2025-12-04T10:32:36.2760675Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2760849Z "size": 196, 2025-12-04T10:32:36.2761003Z + exit 0 2025-12-04T10:32:36.2761172Z "digest": "sha256:f7e9a021f0ee3d11a50dcb96378af8103a21f6c3c142f54529207648f3ed00b2" 2025-12-04T10:32:36.2761362Z }, 2025-12-04T10:32:36.2761446Z { 2025-12-04T10:32:36.2761582Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2761750Z "size": 2583, 2025-12-04T10:32:36.2761915Z "digest": "sha256:8e023b349080fb11ee55491bc9b842b30e9e3a90246d05b303a73dc62038caf2" 2025-12-04T10:32:36.2762103Z }, 2025-12-04T10:32:36.2762189Z { 2025-12-04T10:32:36.2762323Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2762496Z "size": 7577171420, 2025-12-04T10:32:36.2762670Z "digest": "sha256:8188df80e595a3dbcf84623c6a58a655269898cbb60029435f136d7f9d34ccaa" 2025-12-04T10:32:36.2762857Z }, 2025-12-04T10:32:36.2762938Z { 2025-12-04T10:32:36.2763073Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2763242Z "size": 135, 2025-12-04T10:32:36.2763421Z "digest": "sha256:3c2c2f8c74bfa16c4bf9a832c97bbb1d55205b2b4a2cead02cf74301ca1001fb" 2025-12-04T10:32:36.2763609Z }, 2025-12-04T10:32:36.2763696Z { 2025-12-04T10:32:36.2763832Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2763995Z "size": 104, 2025-12-04T10:32:36.2764219Z "digest": "sha256:2aa7784fbe3300f8bbfb6bb51cff3b01fd091e829c2bc7ab9e25261a0dd9b3bd" 2025-12-04T10:32:36.2764411Z }, 
2025-12-04T10:32:36.2764495Z { 2025-12-04T10:32:36.2764674Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2764838Z "size": 612, 2025-12-04T10:32:36.2765010Z "digest": "sha256:2b3b5215d3ebe8789f0444457bfd5a6e218289b64aa07653ac3d03ddda5e6708" 2025-12-04T10:32:36.2765195Z }, 2025-12-04T10:32:36.2765281Z { 2025-12-04T10:32:36.2765418Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2765585Z "size": 838191945, 2025-12-04T10:32:36.2765764Z "digest": "sha256:99b1f1ea3e857834cebd01763d90fbd700aeb9c2d2ef23eda2cfff5652c9708b" 2025-12-04T10:32:36.2765954Z }, 2025-12-04T10:32:36.2766041Z { 2025-12-04T10:32:36.2766170Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2766356Z "size": 111, 2025-12-04T10:32:36.2766528Z "digest": "sha256:18d6daba0a5768a37ad106b57974f6b7efd35c43a87c246bcd3f43fea88f2d2b" 2025-12-04T10:32:36.2766719Z }, 2025-12-04T10:32:36.2766803Z { 2025-12-04T10:32:36.2766934Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2767087Z "size": 1555, 2025-12-04T10:32:36.2767243Z "digest": "sha256:5277f2a503ebd17ba9d9b86cc9bac86265504adeb449c0647616ddaacd3cbc41" 2025-12-04T10:32:36.2767414Z }, 2025-12-04T10:32:36.2767489Z { 2025-12-04T10:32:36.2767610Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2767760Z "size": 107, 2025-12-04T10:32:36.2767911Z "digest": "sha256:3198a9717aace920fd5de085319adf75091af05fc4318ce4b16a8a5b0e8d449e" 2025-12-04T10:32:36.2768084Z }, 2025-12-04T10:32:36.2768160Z { 2025-12-04T10:32:36.2768280Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2768430Z "size": 166, 2025-12-04T10:32:36.2768580Z "digest": "sha256:99a4918e5808277879449e97ccd7190db6b9aa2d742b57a3b831ce0198522bdd" 2025-12-04T10:32:36.2768750Z }, 2025-12-04T10:32:36.2768830Z { 2025-12-04T10:32:36.2768950Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2769102Z "size": 3526081, 2025-12-04T10:32:36.2769259Z "digest": "sha256:15bb11dfc6acc3537d527d6771c8e711e5605e99f82ec41e805d4600b8a97516" 2025-12-04T10:32:36.2769428Z }, 2025-12-04T10:32:36.2769507Z { 2025-12-04T10:32:36.2769687Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2769836Z "size": 107, 2025-12-04T10:32:36.2769990Z "digest": "sha256:bd87c8766e90e33db17514558ac591cc3f4149afd7abeaef4dd5770bbfa14210" 2025-12-04T10:32:36.2770161Z }, 2025-12-04T10:32:36.2770238Z { 2025-12-04T10:32:36.2770360Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2770510Z "size": 829, 2025-12-04T10:32:36.2770661Z "digest": "sha256:1969e15d0c13874ea5883ed829235a19ef6dc21c8aa6172032b78a8ffa6ff262" 2025-12-04T10:32:36.2770830Z }, 2025-12-04T10:32:36.2770908Z { 2025-12-04T10:32:36.2771028Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2771182Z "size": 26973054, 2025-12-04T10:32:36.2771341Z "digest": "sha256:24a03847d382b73c11969f8f73916a6bedf5ccea12f6f4290b3880f29ceda32a" 2025-12-04T10:32:36.2771508Z }, 2025-12-04T10:32:36.2771585Z { 2025-12-04T10:32:36.2771706Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2771856Z "size": 104, 2025-12-04T10:32:36.2772009Z "digest": "sha256:816e2e34e01839a35d624dbf4bd9ac9bea4c975104af47a0e6b6b6dee6c6f98d" 2025-12-04T10:32:36.2772180Z }, 2025-12-04T10:32:36.2772257Z { 2025-12-04T10:32:36.2772377Z "mediaType": 
"application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2772527Z "size": 424, 2025-12-04T10:32:36.2772680Z "digest": "sha256:b168858b85373f8ddca549d79267a06de4fa945d04bf791c55c9ddc93957fa3c" 2025-12-04T10:32:36.2772847Z }, 2025-12-04T10:32:36.2772924Z { 2025-12-04T10:32:36.2773091Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2773243Z "size": 19309386, 2025-12-04T10:32:36.2773452Z "digest": "sha256:6b8d5ff02e267e38322afbb8a58ed63ce9d75b10e9e73255e6affcbc6b6539bf" 2025-12-04T10:32:36.2773623Z }, 2025-12-04T10:32:36.2773697Z { 2025-12-04T10:32:36.2773818Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2773969Z "size": 826, 2025-12-04T10:32:36.2774122Z "digest": "sha256:4e3b10a5dd6aed29f238d604925e2a4f873141c1087c8dd4fdde5c61e7560893" 2025-12-04T10:32:36.2774292Z }, 2025-12-04T10:32:36.2774367Z { 2025-12-04T10:32:36.2774488Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2774638Z "size": 724, 2025-12-04T10:32:36.2774786Z "digest": "sha256:9681563a88ff9e62494a2740e537440d3df978d466c9478d6a941fae8b57b084" 2025-12-04T10:32:36.2774952Z }, 2025-12-04T10:32:36.2775030Z { 2025-12-04T10:32:36.2775156Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2775307Z "size": 149, 2025-12-04T10:32:36.2775460Z "digest": "sha256:3092fab73b59190b9facfc49bf18f58612172bc2fd68dfa339a1118632616939" 2025-12-04T10:32:36.2775631Z }, 2025-12-04T10:32:36.2775708Z { 2025-12-04T10:32:36.2775830Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2775979Z "size": 136, 2025-12-04T10:32:36.2776134Z "digest": "sha256:20020dd28a15ba092fcbfe906ee39cdddfcc9d0b7eb42fdd6f4c08a984fa9c00" 2025-12-04T10:32:36.2776308Z }, 2025-12-04T10:32:36.2776383Z { 2025-12-04T10:32:36.2776503Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2776653Z "size": 140, 2025-12-04T10:32:36.2776806Z "digest": "sha256:ae5280ce969dcff08c091e9a5f7641f13561b2b0ee44d78b7c3f81d8fe8e6d32" 2025-12-04T10:32:36.2776977Z }, 2025-12-04T10:32:36.2777054Z { 2025-12-04T10:32:36.2777175Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2777328Z "size": 32, 2025-12-04T10:32:36.2777482Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-12-04T10:32:36.2777656Z }, 2025-12-04T10:32:36.2777731Z { 2025-12-04T10:32:36.2777852Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2778002Z "size": 222, 2025-12-04T10:32:36.2778156Z "digest": "sha256:fe17d9eb0fd26d3af4c724bf570d833978b131cedb7dc17a800aa388a246b3cd" 2025-12-04T10:32:36.2778326Z }, 2025-12-04T10:32:36.2778405Z { 2025-12-04T10:32:36.2778527Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2778678Z "size": 346, 2025-12-04T10:32:36.2778830Z "digest": "sha256:a51e0dab2d596e6563483f27c12660007160847d177ba4c31812a8f44ada5754" 2025-12-04T10:32:36.2778996Z }, 2025-12-04T10:32:36.2779073Z { 2025-12-04T10:32:36.2779195Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2779348Z "size": 88300, 2025-12-04T10:32:36.2779511Z "digest": "sha256:6eb176cefd72d37ecbcdf074289a8f1de732d8816cc695ece7e4709d098094d6" 2025-12-04T10:32:36.2779723Z }, 2025-12-04T10:32:36.2779803Z { 2025-12-04T10:32:36.2779923Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2780074Z "size": 106, 
2025-12-04T10:32:36.2780228Z "digest": "sha256:e7b8cf2e8d5a4c56db9726ce62c1176032408b3b1c25a000592361cb4245e2b5" 2025-12-04T10:32:36.2780397Z }, 2025-12-04T10:32:36.2780474Z { 2025-12-04T10:32:36.2780595Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2780746Z "size": 1671, 2025-12-04T10:32:36.2780902Z "digest": "sha256:ef3a5060abce88884bc8bd815aa41c46427f34eeb132fe0ddd85a3f86e6dc83d" 2025-12-04T10:32:36.2781073Z }, 2025-12-04T10:32:36.2781149Z { 2025-12-04T10:32:36.2781271Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2781424Z "size": 724, 2025-12-04T10:32:36.2781620Z "digest": "sha256:9681563a88ff9e62494a2740e537440d3df978d466c9478d6a941fae8b57b084" 2025-12-04T10:32:36.2781789Z }, 2025-12-04T10:32:36.2781865Z { 2025-12-04T10:32:36.2782022Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2782176Z "size": 138, 2025-12-04T10:32:36.2782332Z "digest": "sha256:a6f4ec14b42b8f0a83d20aa6a985ddb6a1bf64e0ed3d44afd3484b87d4ed5ad3" 2025-12-04T10:32:36.2782506Z }, 2025-12-04T10:32:36.2782582Z { 2025-12-04T10:32:36.2782701Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2782849Z "size": 119, 2025-12-04T10:32:36.2783002Z "digest": "sha256:7e5a0c956cfbd6f8074fbfd3b1d416e6635d632835ec00c8dd4c015a21da19b4" 2025-12-04T10:32:36.2783172Z }, 2025-12-04T10:32:36.2783247Z { 2025-12-04T10:32:36.2783370Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2783522Z "size": 6238423049, 2025-12-04T10:32:36.2783690Z "digest": "sha256:b4f78730cfe76ce091b78b2e2e3d52be03f1097b3e4c3de5bd79f8d13a853132" 2025-12-04T10:32:36.2783862Z }, 2025-12-04T10:32:36.2783939Z { 2025-12-04T10:32:36.2784060Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2784214Z "size": 174, 2025-12-04T10:32:36.2784364Z "digest": "sha256:081028f24389b112683689fd362e8c0d6f358082710e72feab91cea6383feb4d" 2025-12-04T10:32:36.2784529Z }, 2025-12-04T10:32:36.2784605Z { 2025-12-04T10:32:36.2784729Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2784879Z "size": 1896, 2025-12-04T10:32:36.2785037Z "digest": "sha256:a534dcf4b9a9e5fabed742c8a8fc43c9cfe7346ea88ab3c177c3b14fd3afe00a" 2025-12-04T10:32:36.2785210Z }, 2025-12-04T10:32:36.2785286Z { 2025-12-04T10:32:36.2785407Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2785558Z "size": 197577597, 2025-12-04T10:32:36.2785715Z "digest": "sha256:2e77500302cc13224427e1d74e471bd79d5109ba6a5099a83df1d10b786f71ba" 2025-12-04T10:32:36.2785885Z }, 2025-12-04T10:32:36.2785961Z { 2025-12-04T10:32:36.2786083Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2786238Z "size": 304, 2025-12-04T10:32:36.2786466Z "digest": "sha256:bc08246bb4ba18c3ec5bc69e16b6b4e929c5bd0f3fae10eeb0b1a622a63d6fa2" 2025-12-04T10:32:36.2786639Z }, 2025-12-04T10:32:36.2786717Z { 2025-12-04T10:32:36.2786840Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2786992Z "size": 32, 2025-12-04T10:32:36.2787147Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-12-04T10:32:36.2787316Z }, 2025-12-04T10:32:36.2787391Z { 2025-12-04T10:32:36.2787511Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2787661Z "size": 106, 2025-12-04T10:32:36.2787815Z "digest": 
"sha256:ff0c473ca120ebdcaa2ba10b3274e82032edd5196019e76d4e7584553704ae81" 2025-12-04T10:32:36.2787986Z }, 2025-12-04T10:32:36.2788065Z { 2025-12-04T10:32:36.2788187Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T10:32:36.2788339Z "size": 54145662, 2025-12-04T10:32:36.2788503Z "digest": "sha256:6bbc14b250efb3cdaad12c91573c6bb9129ad3e3432f0ed1a7eaebc9958d162f" 2025-12-04T10:32:36.2788675Z } 2025-12-04T10:32:36.2788750Z ] 2025-12-04T10:32:36.2788828Z } 2025-12-04T10:32:36.2804646Z ##[group]Run set -eux 2025-12-04T10:32:36.2804763Z set -eux 2025-12-04T10:32:36.2804922Z # It's ok if this steps fails, it would then be an anonymous user like what we used to have 2025-12-04T10:32:36.2805345Z aws secretsmanager get-secret-value --secret-id docker_hub_readonly_token | jq --raw-output '.SecretString' | jq -r .docker_hub_readonly_token | docker login --username pytorchbot --password-stdin || true 2025-12-04T10:32:36.2810001Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T10:32:36.2810150Z env: 2025-12-04T10:32:36.2810242Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:32:36.2810435Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:32:36.2810610Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:32:36.2810814Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:32:36.2811324Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:32:36.2811816Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T10:32:36.2811930Z AWS_REGION: us-east-1 2025-12-04T10:32:36.2812150Z AWS_ACCESS_KEY_ID: *** 2025-12-04T10:32:36.2812304Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T10:32:36.2814296Z AWS_SESSION_TOKEN: *** 2025-12-04T10:32:36.2814400Z ##[endgroup] 2025-12-04T10:32:36.2837524Z + aws secretsmanager get-secret-value --secret-id docker_hub_readonly_token 2025-12-04T10:32:36.2838080Z /home/runner/_work/_temp/2d96020d-af1a-42cb-b736-57bd79e6fe84.sh: line 3: aws: command not found 2025-12-04T10:32:36.2838524Z + jq --raw-output .SecretString 2025-12-04T10:32:36.2839888Z + jq -r .docker_hub_readonly_token 2025-12-04T10:32:36.2841223Z + docker login --username pytorchbot --password-stdin 2025-12-04T10:32:36.2942082Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T10:32:36.2949753Z + true 2025-12-04T10:32:36.3001617Z ##[group]Run pytorch/test-infra/.github/actions/pull-docker-image@main 2025-12-04T10:32:36.3001793Z with: 2025-12-04T10:32:36.3002061Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T10:32:36.3002394Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T10:32:36.3002548Z env: 2025-12-04T10:32:36.3002641Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:32:36.3002782Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:32:36.3002961Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:32:36.3003139Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:32:36.3003657Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video 
--group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:32:36.3004152Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T10:32:36.3004268Z AWS_REGION: us-east-1 2025-12-04T10:32:36.3004407Z AWS_ACCESS_KEY_ID: *** 2025-12-04T10:32:36.3004563Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T10:32:36.3006578Z AWS_SESSION_TOKEN: *** 2025-12-04T10:32:36.3006681Z ##[endgroup] 2025-12-04T10:32:36.3013309Z ##[group]Run set -x 2025-12-04T10:32:36.3013417Z set -x 2025-12-04T10:32:36.3013507Z set +e 2025-12-04T10:32:36.3013596Z  2025-12-04T10:32:36.3013681Z login() { 2025-12-04T10:32:36.3013865Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2025-12-04T10:32:36.3014059Z } 2025-12-04T10:32:36.3014143Z  2025-12-04T10:32:36.3014227Z retry () { 2025-12-04T10:32:36.3014338Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2025-12-04T10:32:36.3014461Z } 2025-12-04T10:32:36.3014546Z  2025-12-04T10:32:36.3014639Z retry login "${DOCKER_REGISTRY}" 2025-12-04T10:32:36.3014756Z  2025-12-04T10:32:36.3014939Z IMAGE_SIZE=$(docker manifest inspect "${DOCKER_IMAGE}" | jq '[.layers[].size, .config.size] | add / 1024 / 1024') 2025-12-04T10:32:36.3015184Z echo "Compressed size of image in MB: ${IMAGE_SIZE}" 2025-12-04T10:32:36.3015329Z  2025-12-04T10:32:36.3015411Z set -e 2025-12-04T10:32:36.3015544Z # ignore output since only exit code is used for conditional 2025-12-04T10:32:36.3015727Z # only pull docker image if it's not available locally 2025-12-04T10:32:36.3015984Z if ! docker inspect --type=image "${DOCKER_IMAGE}" >/dev/null 2>/dev/null; then 2025-12-04T10:32:36.3016170Z  retry docker pull "${DOCKER_IMAGE}" 2025-12-04T10:32:36.3016293Z fi 2025-12-04T10:32:36.3018863Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T10:32:36.3019010Z env: 2025-12-04T10:32:36.3019102Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:32:36.3019236Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:32:36.3019408Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:32:36.3019619Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:32:36.3020116Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:32:36.3020610Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T10:32:36.3020725Z AWS_REGION: us-east-1 2025-12-04T10:32:36.3020863Z AWS_ACCESS_KEY_ID: *** 2025-12-04T10:32:36.3021020Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T10:32:36.3023006Z AWS_SESSION_TOKEN: *** 2025-12-04T10:32:36.3023357Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T10:32:36.3023672Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T10:32:36.3023820Z ##[endgroup] 2025-12-04T10:32:36.3040899Z + set +e 2025-12-04T10:32:36.3041115Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T10:32:36.3041304Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T10:32:36.3044354Z + aws ecr get-login-password --region us-east-1 2025-12-04T10:32:36.3044573Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T10:32:36.3044846Z 
/home/runner/_work/_temp/58254ef0-9b6c-4873-95c4-52ff485189c5.sh: line 5: aws: command not found 2025-12-04T10:32:36.3116951Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T10:32:36.3127107Z + sleep 1 2025-12-04T10:32:37.3138457Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T10:32:37.3141494Z + aws ecr get-login-password --region us-east-1 2025-12-04T10:32:37.3141880Z /home/runner/_work/_temp/58254ef0-9b6c-4873-95c4-52ff485189c5.sh: line 5: aws: command not found 2025-12-04T10:32:37.3142438Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T10:32:37.3244145Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T10:32:37.3255785Z + sleep 2 2025-12-04T10:32:39.3267509Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T10:32:39.3270229Z + aws ecr get-login-password --region us-east-1 2025-12-04T10:32:39.3270790Z /home/runner/_work/_temp/58254ef0-9b6c-4873-95c4-52ff485189c5.sh: line 5: aws: command not found 2025-12-04T10:32:39.3272512Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T10:32:39.3360956Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T10:32:39.3379737Z ++ docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T10:32:39.3380404Z ++ jq '[.layers[].size, .config.size] | add / 1024 / 1024' 2025-12-04T10:32:40.7112541Z + IMAGE_SIZE=18171.470620155334 2025-12-04T10:32:40.7112761Z + echo 'Compressed size of image in MB: 18171.470620155334' 2025-12-04T10:32:40.7112931Z + set -e 2025-12-04T10:32:40.7113225Z + docker inspect --type=image 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T10:32:40.7113553Z Compressed size of image in MB: 18171.470620155334 2025-12-04T10:32:40.7303199Z Prepare all required actions 2025-12-04T10:32:40.7317735Z ##[group]Run ./.github/actions/get-workflow-job-id 2025-12-04T10:32:40.7317876Z with: 2025-12-04T10:32:40.7318120Z github-token: *** 2025-12-04T10:32:40.7318218Z env: 2025-12-04T10:32:40.7318308Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:32:40.7318442Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:32:40.7318616Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:32:40.7318799Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:32:40.7319304Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:32:40.7319856Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T10:32:40.7319983Z AWS_REGION: us-east-1 2025-12-04T10:32:40.7320104Z AWS_ACCESS_KEY_ID: *** 2025-12-04T10:32:40.7320270Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T10:32:40.7322253Z AWS_SESSION_TOKEN: *** 2025-12-04T10:32:40.7322355Z ##[endgroup] 2025-12-04T10:32:40.7328383Z ##[group]Run set -eux 2025-12-04T10:32:40.7328497Z set -eux 2025-12-04T10:32:40.7328671Z python3 .github/scripts/get_workflow_job_id.py "${GITHUB_RUN_ID}" "${RUNNER_NAME}" 2025-12-04T10:32:40.7332639Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T10:32:40.7332785Z env: 
2025-12-04T10:32:40.7332878Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:32:40.7333014Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:32:40.7333188Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:32:40.7333353Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:32:40.7333861Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:32:40.7334349Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T10:32:40.7334465Z AWS_REGION: us-east-1 2025-12-04T10:32:40.7334599Z AWS_ACCESS_KEY_ID: *** 2025-12-04T10:32:40.7334861Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T10:32:40.7336839Z AWS_SESSION_TOKEN: *** 2025-12-04T10:32:40.7336990Z GITHUB_TOKEN: *** 2025-12-04T10:32:40.7337085Z ##[endgroup] 2025-12-04T10:32:40.7353270Z + python3 .github/scripts/get_workflow_job_id.py 19922849170 linux.rocm.gpu.gfx942.4.b-bphpw-runner-5l4hk 2025-12-04T10:32:41.9338791Z Setting output job-id=57116213187 2025-12-04T10:32:41.9339494Z Setting output job-name=linux-jammy-rocm-py3.10 / test (distributed, 2, 3, linux.rocm.gpu.gfx942.4.b, mem_leak_check, unstable) 2025-12-04T10:32:41.9438961Z Prepare all required actions 2025-12-04T10:32:41.9439200Z Getting action download info 2025-12-04T10:32:42.1630306Z Download action repository 'seemethere/download-artifact-s3@v4' (SHA:1da556a7aa0a088e3153970611f6c432d58e80e6) 2025-12-04T10:32:43.2837396Z Download action repository 'actions/download-artifact@v4' (SHA:d3f86a106a0bac45b974a628896c90dbdf5c8093) 2025-12-04T10:32:44.3142136Z ##[group]Run ./.github/actions/download-build-artifacts 2025-12-04T10:32:44.3142328Z with: 2025-12-04T10:32:44.3142450Z name: linux-jammy-rocm-py3.10 2025-12-04T10:32:44.3142596Z s3-bucket: gha-artifacts 2025-12-04T10:32:44.3142725Z env: 2025-12-04T10:32:44.3142844Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:32:44.3142996Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:32:44.3143200Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:32:44.3143385Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:32:44.3143975Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:32:44.3144639Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T10:32:44.3144775Z AWS_REGION: us-east-1 2025-12-04T10:32:44.3144984Z AWS_ACCESS_KEY_ID: *** 2025-12-04T10:32:44.3145154Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T10:32:44.3147194Z AWS_SESSION_TOKEN: *** 2025-12-04T10:32:44.3147315Z ##[endgroup] 2025-12-04T10:32:44.3173439Z ##[group]Run seemethere/download-artifact-s3@v4 2025-12-04T10:32:44.3173582Z with: 2025-12-04T10:32:44.3173686Z name: linux-jammy-rocm-py3.10 2025-12-04T10:32:44.3173810Z s3-bucket: gha-artifacts 2025-12-04T10:32:44.3173924Z region: us-east-1 2025-12-04T10:32:44.3174024Z env: 2025-12-04T10:32:44.3174124Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:32:44.3174289Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:32:44.3174470Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:32:44.3174642Z 
RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:32:44.3175143Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:32:44.3175629Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T10:32:44.3175745Z AWS_REGION: us-east-1 2025-12-04T10:32:44.3175880Z AWS_ACCESS_KEY_ID: *** 2025-12-04T10:32:44.3176041Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T10:32:44.3178018Z AWS_SESSION_TOKEN: *** 2025-12-04T10:32:44.3178123Z ##[endgroup] 2025-12-04T10:32:44.5456005Z (node:17062) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 2025-12-04T10:32:44.5456220Z 2025-12-04T10:32:44.5456527Z Please migrate your code to use AWS SDK for JavaScript (v3). 2025-12-04T10:32:44.5456774Z For more information, check the migration guide at https://a.co/7PzMCcy 2025-12-04T10:32:44.5457022Z (Use `node --trace-warnings ...` to show where the warning was created) 2025-12-04T10:32:44.8121158Z Found 1 objects with prefix pytorch/pytorch/19922849170/linux-jammy-rocm-py3.10/ 2025-12-04T10:32:44.8121672Z Starting download (1/1): /home/runner/_work/pytorch/pytorch/artifacts.zip 2025-12-04T10:33:50.4971816Z Finished download (1/1): /home/runner/_work/pytorch/pytorch/artifacts.zip 2025-12-04T10:33:50.4974479Z Artifact download has finished successfully 2025-12-04T10:33:50.5264068Z ##[group]Run unzip -o artifacts.zip 2025-12-04T10:33:50.5264288Z unzip -o artifacts.zip 2025-12-04T10:33:50.5268719Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T10:33:50.5268886Z env: 2025-12-04T10:33:50.5269173Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:33:50.5269328Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:33:50.5269526Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:33:50.5269791Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:33:50.5270356Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:33:50.5270916Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T10:33:50.5271046Z AWS_REGION: us-east-1 2025-12-04T10:33:50.5271238Z AWS_ACCESS_KEY_ID: *** 2025-12-04T10:33:50.5271407Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T10:33:50.5273610Z AWS_SESSION_TOKEN: *** 2025-12-04T10:33:50.5273726Z ##[endgroup] 2025-12-04T10:33:50.5312488Z Archive: artifacts.zip 2025-12-04T10:33:50.5313092Z creating: dist/ 2025-12-04T10:33:50.5396336Z inflating: dist/.ninja_log 2025-12-04T10:33:53.4696228Z inflating: dist/torch-2.10.0a0+gitffd9b0f-cp310-cp310-linux_x86_64.whl 2025-12-04T10:33:53.4697309Z creating: build/ 2025-12-04T10:33:53.4697580Z creating: build/custom_test_artifacts/ 2025-12-04T10:33:53.4697969Z creating: build/custom_test_artifacts/custom-op-build/ 2025-12-04T10:33:53.4698425Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/ 2025-12-04T10:33:53.4698967Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/pkgRedirects/ 2025-12-04T10:33:53.4699773Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeConfigureLog.yaml 
2025-12-04T10:33:53.4700363Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/ 2025-12-04T10:33:53.4700947Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeSystem.cmake 2025-12-04T10:33:53.4701590Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/ 2025-12-04T10:33:53.4702198Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/tmp/ 2025-12-04T10:33:53.4702908Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/CMakeCCompilerId.c 2025-12-04T10:33:53.4703612Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/a.out 2025-12-04T10:33:53.4704273Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeCCompiler.cmake 2025-12-04T10:33:53.4704915Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/ 2025-12-04T10:33:53.4705538Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/tmp/ 2025-12-04T10:33:53.4706272Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-12-04T10:33:53.4707006Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/a.out 2025-12-04T10:33:53.4707674Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeCXXCompiler.cmake 2025-12-04T10:33:53.4708413Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_C.bin 2025-12-04T10:33:53.4709207Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CXX.bin 2025-12-04T10:33:53.4709729Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeScratch/ 2025-12-04T10:33:53.4710089Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeTmp/ 2025-12-04T10:33:53.4710467Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/cmake.check_cache 2025-12-04T10:33:53.4710861Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/ 2025-12-04T10:33:53.4711462Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.ts 2025-12-04T10:33:53.4711949Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.make 2025-12-04T10:33:53.4712418Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/depend.make 2025-12-04T10:33:53.4712860Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/link.txt 2025-12-04T10:33:53.4713315Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/cmake_clean.cmake 2025-12-04T10:33:53.4713773Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/build.make 2025-12-04T10:33:53.4714226Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/DependInfo.cmake 2025-12-04T10:33:53.4714679Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/flags.make 2025-12-04T10:33:53.4715134Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/progress.make 2025-12-04T10:33:53.4720477Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o.d 2025-12-04T10:33:53.4827650Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o 2025-12-04T10:33:53.4828091Z 
creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/ 2025-12-04T10:33:53.4828472Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.ts 2025-12-04T10:33:53.4828877Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.make 2025-12-04T10:33:53.4829259Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/depend.make 2025-12-04T10:33:53.4829670Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/link.txt 2025-12-04T10:33:53.4830049Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/cmake_clean.cmake 2025-12-04T10:33:53.4830421Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/build.make 2025-12-04T10:33:53.4830791Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/DependInfo.cmake 2025-12-04T10:33:53.4831164Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/flags.make 2025-12-04T10:33:53.4831531Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/progress.make 2025-12-04T10:33:53.4841600Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o.d 2025-12-04T10:33:53.4885260Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o 2025-12-04T10:33:53.4885644Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-12-04T10:33:53.4885977Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/TargetDirectories.txt 2025-12-04T10:33:53.4886274Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/progress.marks 2025-12-04T10:33:53.4886556Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile2 2025-12-04T10:33:53.4886830Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile.cmake 2025-12-04T10:33:53.4887115Z inflating: build/custom_test_artifacts/custom-op-build/hipblaslt_test_outer_vec.cc 2025-12-04T10:33:53.4887393Z inflating: build/custom_test_artifacts/custom-op-build/hipblaslt_test_vec_ext.cc 2025-12-04T10:33:53.4888046Z inflating: build/custom_test_artifacts/custom-op-build/CMakeCache.txt 2025-12-04T10:33:53.4888346Z inflating: build/custom_test_artifacts/custom-op-build/Makefile 2025-12-04T10:33:53.4888823Z inflating: build/custom_test_artifacts/custom-op-build/cmake_install.cmake 2025-12-04T10:33:53.4980449Z inflating: build/custom_test_artifacts/custom-op-build/libcustom_ops.so 2025-12-04T10:33:53.5009927Z inflating: build/custom_test_artifacts/custom-op-build/test_custom_ops 2025-12-04T10:33:53.5010244Z creating: build/custom_test_artifacts/jit-hook-build/ 2025-12-04T10:33:53.5010529Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/ 2025-12-04T10:33:53.5010854Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/pkgRedirects/ 2025-12-04T10:33:53.5012623Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeConfigureLog.yaml 2025-12-04T10:33:53.5013012Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/ 2025-12-04T10:33:53.5013365Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeSystem.cmake 2025-12-04T10:33:53.5013749Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/ 
2025-12-04T10:33:53.5014115Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/tmp/ 2025-12-04T10:33:53.5014554Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/CMakeCCompilerId.c 2025-12-04T10:33:53.5014995Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/a.out 2025-12-04T10:33:53.5015481Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeCCompiler.cmake 2025-12-04T10:33:53.5015863Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/ 2025-12-04T10:33:53.5016242Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/tmp/ 2025-12-04T10:33:53.5016690Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-12-04T10:33:53.5017400Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/a.out 2025-12-04T10:33:53.5017733Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeCXXCompiler.cmake 2025-12-04T10:33:53.5018828Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_C.bin 2025-12-04T10:33:53.5019674Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CXX.bin 2025-12-04T10:33:53.5019987Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeScratch/ 2025-12-04T10:33:53.5020228Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeTmp/ 2025-12-04T10:33:53.5020486Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/cmake.check_cache 2025-12-04T10:33:53.5020751Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/ 2025-12-04T10:33:53.5021048Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.ts 2025-12-04T10:33:53.5021390Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.make 2025-12-04T10:33:53.5021715Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/depend.make 2025-12-04T10:33:53.5022014Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/link.txt 2025-12-04T10:33:53.5022331Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/cmake_clean.cmake 2025-12-04T10:33:53.5022652Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/build.make 2025-12-04T10:33:53.5022964Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/DependInfo.cmake 2025-12-04T10:33:53.5023279Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/flags.make 2025-12-04T10:33:53.5023594Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/progress.make 2025-12-04T10:33:53.5033716Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o.d 2025-12-04T10:33:53.5067511Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o 2025-12-04T10:33:53.5067830Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-12-04T10:33:53.5068122Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/TargetDirectories.txt 2025-12-04T10:33:53.5068378Z extracting: 
build/custom_test_artifacts/jit-hook-build/CMakeFiles/progress.marks 2025-12-04T10:33:53.5068622Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile2 2025-12-04T10:33:53.5069182Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile.cmake 2025-12-04T10:33:53.5069440Z inflating: build/custom_test_artifacts/jit-hook-build/hipblaslt_test_outer_vec.cc 2025-12-04T10:33:53.5069702Z inflating: build/custom_test_artifacts/jit-hook-build/hipblaslt_test_vec_ext.cc 2025-12-04T10:33:53.5070545Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeCache.txt 2025-12-04T10:33:53.5071000Z inflating: build/custom_test_artifacts/jit-hook-build/Makefile 2025-12-04T10:33:53.5071411Z inflating: build/custom_test_artifacts/jit-hook-build/cmake_install.cmake 2025-12-04T10:33:53.5091807Z inflating: build/custom_test_artifacts/jit-hook-build/test_jit_hooks 2025-12-04T10:33:53.5092121Z creating: build/custom_test_artifacts/custom-backend-build/ 2025-12-04T10:33:53.5092432Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/ 2025-12-04T10:33:53.5092788Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/pkgRedirects/ 2025-12-04T10:33:53.5094390Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeConfigureLog.yaml 2025-12-04T10:33:53.5094795Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/ 2025-12-04T10:33:53.5095188Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeSystem.cmake 2025-12-04T10:33:53.5095622Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/ 2025-12-04T10:33:53.5096048Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/tmp/ 2025-12-04T10:33:53.5096541Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/CMakeCCompilerId.c 2025-12-04T10:33:53.5097031Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/a.out 2025-12-04T10:33:53.5097500Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeCCompiler.cmake 2025-12-04T10:33:53.5097939Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/ 2025-12-04T10:33:53.5098357Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/tmp/ 2025-12-04T10:33:53.5098992Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-12-04T10:33:53.5099794Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/a.out 2025-12-04T10:33:53.5100246Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeCXXCompiler.cmake 2025-12-04T10:33:53.5101258Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_C.bin 2025-12-04T10:33:53.5101921Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CXX.bin 2025-12-04T10:33:53.5102354Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeScratch/ 2025-12-04T10:33:53.5102705Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeTmp/ 2025-12-04T10:33:53.5103063Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/cmake.check_cache 2025-12-04T10:33:53.5103461Z creating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/ 2025-12-04T10:33:53.5103963Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.ts 2025-12-04T10:33:53.5104443Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.make 2025-12-04T10:33:53.5104912Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/depend.make 2025-12-04T10:33:53.5105336Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/link.txt 2025-12-04T10:33:53.5105783Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/cmake_clean.cmake 2025-12-04T10:33:53.5106243Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/build.make 2025-12-04T10:33:53.5106696Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/DependInfo.cmake 2025-12-04T10:33:53.5107140Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/flags.make 2025-12-04T10:33:53.5107577Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/progress.make 2025-12-04T10:33:53.5108111Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o.d 2025-12-04T10:33:53.5170645Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o 2025-12-04T10:33:53.5170984Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/ 2025-12-04T10:33:53.5171345Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.ts 2025-12-04T10:33:53.5171730Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.make 2025-12-04T10:33:53.5172102Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/depend.make 2025-12-04T10:33:53.5172443Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/link.txt 2025-12-04T10:33:53.5172807Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/cmake_clean.cmake 2025-12-04T10:33:53.5173162Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/build.make 2025-12-04T10:33:53.5173519Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/DependInfo.cmake 2025-12-04T10:33:53.5173869Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/flags.make 2025-12-04T10:33:53.5174216Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/progress.make 2025-12-04T10:33:53.5184569Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o.d 2025-12-04T10:33:53.5213947Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o 2025-12-04T10:33:53.5214311Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-12-04T10:33:53.5214657Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/TargetDirectories.txt 2025-12-04T10:33:53.5214942Z extracting: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/progress.marks 2025-12-04T10:33:53.5215271Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile2 2025-12-04T10:33:53.5216033Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile.cmake 2025-12-04T10:33:53.5216342Z inflating: build/custom_test_artifacts/custom-backend-build/hipblaslt_test_outer_vec.cc 2025-12-04T10:33:53.5216626Z inflating: build/custom_test_artifacts/custom-backend-build/hipblaslt_test_vec_ext.cc 2025-12-04T10:33:53.5217333Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeCache.txt 2025-12-04T10:33:53.5217726Z inflating: build/custom_test_artifacts/custom-backend-build/Makefile 2025-12-04T10:33:53.5217965Z inflating: build/custom_test_artifacts/custom-backend-build/cmake_install.cmake 2025-12-04T10:33:53.5272014Z inflating: build/custom_test_artifacts/custom-backend-build/libcustom_backend.so 2025-12-04T10:33:53.5292719Z inflating: build/custom_test_artifacts/custom-backend-build/test_custom_backend 2025-12-04T10:33:53.5292916Z creating: build/lib/ 2025-12-04T10:33:53.5338338Z inflating: build/lib/libprotobuf-lite.a 2025-12-04T10:33:53.5582014Z inflating: build/lib/libprotobuf.a 2025-12-04T10:33:53.5855289Z inflating: build/lib/libprotoc.a 2025-12-04T10:33:53.5860780Z inflating: build/lib/libpthreadpool.a 2025-12-04T10:33:53.5864879Z inflating: build/lib/libcpuinfo.a 2025-12-04T10:33:53.5868779Z inflating: build/lib/libcpuinfo_internals.a 2025-12-04T10:33:53.5869274Z inflating: build/lib/libclog.a 2025-12-04T10:33:53.5879508Z inflating: build/lib/libpytorch_qnnpack.a 2025-12-04T10:33:53.5880622Z inflating: build/lib/libnnpack_reference_layers.a 2025-12-04T10:33:53.5890206Z inflating: build/lib/libnnpack.a 2025-12-04T10:33:53.5991335Z inflating: build/lib/libmicrokernels-prod.a 2025-12-04T10:33:53.6459968Z inflating: build/lib/libmicrokernels-all.a 2025-12-04T10:33:53.6497686Z inflating: build/lib/libgtest.a 2025-12-04T10:33:53.6506959Z inflating: build/lib/libgmock.a 2025-12-04T10:33:53.6507186Z inflating: build/lib/libgtest_main.a 2025-12-04T10:33:53.6507390Z inflating: build/lib/libgmock_main.a 2025-12-04T10:33:53.6556862Z inflating: build/lib/libXNNPACK.a 2025-12-04T10:33:53.6598105Z inflating: build/lib/libbenchmark.a 2025-12-04T10:33:53.6598433Z inflating: build/lib/libbenchmark_main.a 2025-12-04T10:33:53.6598667Z inflating: build/lib/libjitprofiling.a 2025-12-04T10:33:53.6603102Z inflating: build/lib/libittnotify.a 2025-12-04T10:33:53.6639169Z inflating: build/lib/libasmjit.a 2025-12-04T10:33:53.7262239Z inflating: build/lib/libfbgemm.a 2025-12-04T10:33:53.7278839Z inflating: build/lib/libtensorpipe_uv.a 2025-12-04T10:33:53.7575491Z inflating: build/lib/libtensorpipe.a 2025-12-04T10:33:53.7641757Z inflating: build/lib/libgloo.a 2025-12-04T10:33:53.7667329Z inflating: build/lib/libonnx_proto.a 2025-12-04T10:33:53.7889124Z inflating: build/lib/libgloo_hip.a 2025-12-04T10:33:53.8282647Z inflating: build/lib/libonnx.a 2025-12-04T10:33:54.3817113Z inflating: build/lib/libdnnl.a 2025-12-04T10:33:54.3827048Z inflating: build/lib/libfmt.a 2025-12-04T10:33:54.3996672Z inflating: build/lib/libkineto.a 2025-12-04T10:33:54.4060994Z inflating: build/lib/libc10.so 2025-12-04T10:33:54.4061374Z inflating: build/lib/libtorch_global_deps.so 2025-12-04T10:33:54.4062529Z inflating: build/lib/libcaffe2_nvrtc.so 2025-12-04T10:33:54.4087615Z inflating: build/lib/libc10_hip.so 2025-12-04T10:33:54.4360610Z inflating: build/lib/libfbgemm_genai.a 
2025-12-04T10:33:56.1291342Z inflating: build/lib/libtorch_cpu.so 2025-12-04T10:33:56.1292983Z inflating: build/lib/libshm.so 2025-12-04T10:33:56.9578713Z inflating: build/lib/libtorch_hip.so 2025-12-04T10:33:56.9579250Z inflating: build/lib/libtorch.so 2025-12-04T10:33:56.9589542Z inflating: build/lib/libjitbackend_test.so 2025-12-04T10:33:56.9602747Z inflating: build/lib/libbackend_with_compiler.so 2025-12-04T10:33:56.9642322Z inflating: build/lib/libtorchbind_test.so 2025-12-04T10:33:56.9656992Z inflating: build/lib/libaoti_custom_ops.so 2025-12-04T10:33:57.0947744Z inflating: build/lib/libtorch_python.so 2025-12-04T10:33:57.0967707Z inflating: build/lib/libnnapi_backend.so 2025-12-04T10:33:57.0968031Z creating: build/bin/ 2025-12-04T10:33:57.0968284Z creating: build/bin/CMakeFiles/ 2025-12-04T10:33:57.0969105Z inflating: build/bin/cmake_install.cmake 2025-12-04T10:33:57.0969401Z inflating: build/bin/CTestTestfile.cmake 2025-12-04T10:33:57.1221129Z inflating: build/bin/protoc-3.13.0.0 2025-12-04T10:33:57.1473255Z inflating: build/bin/protoc 2025-12-04T10:33:57.1505998Z inflating: build/bin/c10_AllocatorConfig_test 2025-12-04T10:33:57.1536809Z inflating: build/bin/c10_CompileTimeFunctionPointer_test 2025-12-04T10:33:57.1568292Z inflating: build/bin/c10_DeviceGuard_test 2025-12-04T10:33:57.1599979Z inflating: build/bin/c10_Device_test 2025-12-04T10:33:57.1636037Z inflating: build/bin/c10_DispatchKeySet_test 2025-12-04T10:33:57.1668533Z inflating: build/bin/c10_Scalar_test 2025-12-04T10:33:57.1698649Z inflating: build/bin/c10_StreamGuard_test 2025-12-04T10:33:57.1732977Z inflating: build/bin/c10_SymInt_test 2025-12-04T10:33:57.1767352Z inflating: build/bin/c10_SizesAndStrides_test 2025-12-04T10:33:57.1799817Z inflating: build/bin/c10_Bitset_test 2025-12-04T10:33:57.1841759Z inflating: build/bin/c10_cow_test 2025-12-04T10:33:57.1874845Z inflating: build/bin/c10_InlineDeviceGuard_test 2025-12-04T10:33:57.1908915Z inflating: build/bin/c10_InlineStreamGuard_test 2025-12-04T10:33:57.1939252Z inflating: build/bin/c10_ArrayRef_test 2025-12-04T10:33:57.1969396Z inflating: build/bin/c10_ConstexprCrc_test 2025-12-04T10:33:57.1999963Z inflating: build/bin/c10_DeadlockDetection_test 2025-12-04T10:33:57.2032867Z inflating: build/bin/c10_IntrusiveList_test 2025-12-04T10:33:57.2063985Z inflating: build/bin/c10_Half_test 2025-12-04T10:33:57.2099299Z inflating: build/bin/c10_Enumerate_test 2025-12-04T10:33:57.2133837Z inflating: build/bin/c10_LeftRight_test 2025-12-04T10:33:57.2165940Z inflating: build/bin/c10_NetworkFlow_test 2025-12-04T10:33:57.2196325Z inflating: build/bin/c10_Semaphore_test 2025-12-04T10:33:57.2227122Z inflating: build/bin/c10_Synchronized_test 2025-12-04T10:33:57.2258966Z inflating: build/bin/c10_TypeIndex_test 2025-12-04T10:33:57.2292713Z inflating: build/bin/c10_ThreadLocal_test 2025-12-04T10:33:57.2324366Z inflating: build/bin/c10_accumulate_test 2025-12-04T10:33:57.2358464Z inflating: build/bin/c10_bfloat16_test 2025-12-04T10:33:57.2388896Z inflating: build/bin/c10_error_test 2025-12-04T10:33:57.2419897Z inflating: build/bin/c10_bit_cast_test 2025-12-04T10:33:57.2453545Z inflating: build/bin/c10_complex_test 2025-12-04T10:33:57.2485663Z inflating: build/bin/c10_exception_test 2025-12-04T10:33:57.2520202Z inflating: build/bin/c10_complex_math_test 2025-12-04T10:33:57.2551193Z inflating: build/bin/c10_flags_test 2025-12-04T10:33:57.2582464Z inflating: build/bin/c10_irange_test 2025-12-04T10:33:57.2613432Z inflating: build/bin/c10_generic_math_test 2025-12-04T10:33:57.2702893Z 
inflating: build/bin/c10_intrusive_ptr_test 2025-12-04T10:33:57.2737565Z inflating: build/bin/c10_logging_test 2025-12-04T10:33:57.2768225Z inflating: build/bin/c10_nofatal_test 2025-12-04T10:33:57.2800858Z inflating: build/bin/c10_lazy_test 2025-12-04T10:33:57.2838489Z inflating: build/bin/c10_ordered_preserving_dict_test 2025-12-04T10:33:57.2871038Z inflating: build/bin/c10_registry_test 2025-12-04T10:33:57.2902581Z inflating: build/bin/c10_ssize_test 2025-12-04T10:33:57.2947200Z inflating: build/bin/c10_optional_test 2025-12-04T10:33:57.3034639Z inflating: build/bin/c10_small_vector_test 2025-12-04T10:33:57.3069013Z inflating: build/bin/c10_string_util_test 2025-12-04T10:33:57.3099674Z inflating: build/bin/c10_tempfile_test 2025-12-04T10:33:57.3129763Z inflating: build/bin/c10_string_view_test 2025-12-04T10:33:57.3156649Z inflating: build/bin/c10_intrusive_ptr_benchmark 2025-12-04T10:33:57.3190637Z inflating: build/bin/c10_typeid_test 2025-12-04T10:33:57.3221086Z inflating: build/bin/c10_hip_HIPAssertionsTest_1_var_test 2025-12-04T10:33:57.3251091Z inflating: build/bin/c10_hip_HIPAssertionsTest_catches_stream 2025-12-04T10:33:57.3281748Z inflating: build/bin/c10_hip_HIPAssertionsTest_catches_thread_and_block_and_device 2025-12-04T10:33:57.3311165Z inflating: build/bin/c10_hip_HIPAssertionsTest_from_2_processes 2025-12-04T10:33:57.3341213Z inflating: build/bin/c10_hip_HIPAssertionsTest_multiple_writes_from_blocks_and_threads 2025-12-04T10:33:57.3371310Z inflating: build/bin/c10_hip_HIPAssertionsTest_multiple_writes_from_multiple_blocks 2025-12-04T10:33:57.3401400Z inflating: build/bin/c10_hip_HIPAssertionsTest_multiple_writes_from_same_block 2025-12-04T10:33:57.3431639Z inflating: build/bin/c10_hip_HIPTest 2025-12-04T10:33:57.3758504Z inflating: build/bin/vec_test_all_types_DEFAULT 2025-12-04T10:33:57.4092818Z inflating: build/bin/vec_test_all_types_AVX512 2025-12-04T10:33:57.4433885Z inflating: build/bin/vec_test_all_types_AVX2 2025-12-04T10:33:57.4491257Z inflating: build/bin/test_aoti_abi_check 2025-12-04T10:33:57.4521961Z inflating: build/bin/test_vec_half_DEFAULT 2025-12-04T10:33:57.4552359Z inflating: build/bin/test_vec_half_AVX2 2025-12-04T10:33:57.4582939Z inflating: build/bin/test_vec_half_AVX512 2025-12-04T10:33:57.4615038Z inflating: build/bin/BackoffTest 2025-12-04T10:33:57.4647418Z inflating: build/bin/FileStoreTest 2025-12-04T10:33:57.4682026Z inflating: build/bin/TCPStoreTest 2025-12-04T10:33:57.4714760Z inflating: build/bin/HashStoreTest 2025-12-04T10:33:57.4755081Z inflating: build/bin/ProcessGroupGlooTest 2025-12-04T10:33:57.4756627Z inflating: build/bin/example_allreduce 2025-12-04T10:33:57.4758663Z inflating: build/bin/torch_shm_manager 2025-12-04T10:33:57.4791674Z inflating: build/bin/static_runtime_bench 2025-12-04T10:33:57.4935148Z inflating: build/bin/static_runtime_test 2025-12-04T10:33:57.4978641Z inflating: build/bin/Dict_test 2025-12-04T10:33:57.5010888Z inflating: build/bin/Dimname_test 2025-12-04T10:33:57.5049863Z inflating: build/bin/MaybeOwned_test 2025-12-04T10:33:57.5084548Z inflating: build/bin/NamedTensor_test 2025-12-04T10:33:57.5120338Z inflating: build/bin/apply_utils_test 2025-12-04T10:33:57.5156106Z inflating: build/bin/atest 2025-12-04T10:33:57.5194654Z inflating: build/bin/basic 2025-12-04T10:33:57.5228107Z inflating: build/bin/broadcast_test 2025-12-04T10:33:57.5259246Z inflating: build/bin/cpu_allocator_test 2025-12-04T10:33:57.5294604Z inflating: build/bin/cpu_generator_test 2025-12-04T10:33:57.5326680Z inflating: 
build/bin/cpu_profiling_allocator_test 2025-12-04T10:33:57.5381648Z inflating: build/bin/cpu_rng_test 2025-12-04T10:33:57.5413358Z inflating: build/bin/dlconvertor_test 2025-12-04T10:33:57.5448268Z inflating: build/bin/extension_backend_test 2025-12-04T10:33:57.5482191Z inflating: build/bin/half_test 2025-12-04T10:33:57.5539509Z inflating: build/bin/ivalue_test 2025-12-04T10:33:57.5570033Z inflating: build/bin/lazy_tensor_test 2025-12-04T10:33:57.5602241Z inflating: build/bin/math_kernel_test 2025-12-04T10:33:57.5634420Z inflating: build/bin/memory_format_test 2025-12-04T10:33:57.5667148Z inflating: build/bin/memory_overlapping_test 2025-12-04T10:33:57.5699536Z inflating: build/bin/mobile_memory_cleanup 2025-12-04T10:33:57.5733511Z inflating: build/bin/native_test 2025-12-04T10:33:57.5764850Z inflating: build/bin/operator_name_test 2025-12-04T10:33:57.5795842Z inflating: build/bin/operators_test 2025-12-04T10:33:57.5827621Z inflating: build/bin/packedtensoraccessor_test 2025-12-04T10:33:57.5868153Z inflating: build/bin/pow_test 2025-12-04T10:33:57.5902685Z inflating: build/bin/quantized_test 2025-12-04T10:33:57.5933275Z inflating: build/bin/reduce_ops_test 2025-12-04T10:33:57.5964451Z inflating: build/bin/reportMemoryUsage_test 2025-12-04T10:33:57.5998352Z inflating: build/bin/scalar_tensor_test 2025-12-04T10:33:57.6033231Z inflating: build/bin/scalar_test 2025-12-04T10:33:57.6064755Z inflating: build/bin/StorageUtils_test 2025-12-04T10:33:57.6096603Z inflating: build/bin/stride_properties_test 2025-12-04T10:33:57.6143718Z inflating: build/bin/tensor_iterator_test 2025-12-04T10:33:57.6176617Z inflating: build/bin/test_parallel 2025-12-04T10:33:57.6208245Z inflating: build/bin/thread_init_test 2025-12-04T10:33:57.6241640Z inflating: build/bin/type_ptr_test 2025-12-04T10:33:57.6277531Z inflating: build/bin/type_test 2025-12-04T10:33:57.6309450Z inflating: build/bin/undefined_tensor_test 2025-12-04T10:33:57.6339860Z inflating: build/bin/verify_api_visibility 2025-12-04T10:33:57.6382356Z inflating: build/bin/legacy_vmap_test 2025-12-04T10:33:57.6413805Z inflating: build/bin/weakref_test 2025-12-04T10:33:57.6445293Z inflating: build/bin/wrapdim_test 2025-12-04T10:33:57.6506816Z inflating: build/bin/List_test 2025-12-04T10:33:57.6538175Z inflating: build/bin/xla_tensor_test 2025-12-04T10:33:57.6574036Z inflating: build/bin/IListRef_test 2025-12-04T10:33:57.6643482Z inflating: build/bin/kernel_function_legacy_test 2025-12-04T10:33:57.6683177Z inflating: build/bin/KernelFunction_test 2025-12-04T10:33:57.6739833Z inflating: build/bin/kernel_function_test 2025-12-04T10:33:57.6813173Z inflating: build/bin/kernel_lambda_legacy_test 2025-12-04T10:33:57.6873091Z inflating: build/bin/kernel_lambda_test 2025-12-04T10:33:57.6909397Z inflating: build/bin/kernel_stackbased_test 2025-12-04T10:33:57.6965592Z inflating: build/bin/make_boxed_from_unboxed_functor_test 2025-12-04T10:33:57.6996874Z inflating: build/bin/CppSignature_test 2025-12-04T10:33:57.7027052Z inflating: build/bin/op_allowlist_test 2025-12-04T10:33:57.7204215Z inflating: build/bin/op_registration_test 2025-12-04T10:33:57.7233796Z inflating: build/bin/hip_complex_math_test 2025-12-04T10:33:57.7267271Z inflating: build/bin/backend_fallback_test 2025-12-04T10:33:57.7297674Z inflating: build/bin/hip_complex_test 2025-12-04T10:33:57.7337919Z inflating: build/bin/inline_container_test 2025-12-04T10:33:57.7370168Z inflating: build/bin/hip_apply_test 2025-12-04T10:33:57.7400402Z inflating: build/bin/hip_distributions_test 2025-12-04T10:33:57.7430458Z 
inflating: build/bin/hip_generator_test 2025-12-04T10:33:57.7460438Z inflating: build/bin/hip_half_test 2025-12-04T10:33:57.7490521Z inflating: build/bin/hip_integer_divider_test 2025-12-04T10:33:57.7520576Z inflating: build/bin/hip_optional_test 2025-12-04T10:33:57.7550638Z inflating: build/bin/hip_packedtensoraccessor_test 2025-12-04T10:33:57.7580711Z inflating: build/bin/hip_vectorized_test 2025-12-04T10:33:57.7612325Z inflating: build/bin/hip_dlconvertor_test 2025-12-04T10:33:57.8231200Z inflating: build/bin/test_jit 2025-12-04T10:33:57.8429443Z inflating: build/bin/test_lazy 2025-12-04T10:33:57.8463803Z inflating: build/bin/test_dist_autograd 2025-12-04T10:33:57.8504968Z inflating: build/bin/test_cpp_rpc 2025-12-04T10:33:57.8505982Z inflating: build/bin/parallel_benchmark 2025-12-04T10:33:57.9162306Z inflating: build/bin/test_api 2025-12-04T10:33:57.9162717Z creating: .additional_ci_files/ 2025-12-04T10:33:57.9198259Z inflating: .additional_ci_files/test-times.json 2025-12-04T10:33:57.9329827Z inflating: .additional_ci_files/test-class-times.json 2025-12-04T10:33:57.9355998Z ##[group]Run rm artifacts.zip 2025-12-04T10:33:57.9356195Z rm artifacts.zip 2025-12-04T10:33:57.9361187Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T10:33:57.9361401Z env: 2025-12-04T10:33:57.9361538Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:33:57.9361721Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:33:57.9361958Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:33:57.9362185Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:33:57.9363052Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:33:57.9363706Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T10:33:57.9363927Z AWS_REGION: us-east-1 2025-12-04T10:33:57.9364123Z AWS_ACCESS_KEY_ID: *** 2025-12-04T10:33:57.9364332Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T10:33:57.9366437Z AWS_SESSION_TOKEN: *** 2025-12-04T10:33:57.9366561Z ##[endgroup] 2025-12-04T10:33:58.0301591Z ##[group]Run df -H 2025-12-04T10:33:58.0301781Z df -H 2025-12-04T10:33:58.0307361Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T10:33:58.0307562Z env: 2025-12-04T10:33:58.0307695Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:33:58.0307881Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:33:58.0308117Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:33:58.0308354Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:33:58.0309018Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:33:58.0309887Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T10:33:58.0310011Z AWS_REGION: us-east-1 2025-12-04T10:33:58.0310224Z AWS_ACCESS_KEY_ID: *** 2025-12-04T10:33:58.0310388Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T10:33:58.0312546Z AWS_SESSION_TOKEN: *** 2025-12-04T10:33:58.0312660Z ##[endgroup] 2025-12-04T10:33:58.0668503Z Filesystem Size Used Avail Use% Mounted on 2025-12-04T10:33:58.0669004Z overlay 16T 460G 15T 3% 
/ 2025-12-04T10:33:58.0669366Z tmpfs 68M 0 68M 0% /dev 2025-12-04T10:33:58.0669870Z /dev/md0 16T 460G 15T 3% /run 2025-12-04T10:33:58.0670228Z shm 68M 17k 68M 1% /dev/shm 2025-12-04T10:33:58.0670822Z amdprj2-k8s_2 5.5T 120G 5.4T 3% /home/runner/pytorch-data 2025-12-04T10:33:58.0671480Z tmpfs 3.3T 13k 3.3T 1% /run/secrets/kubernetes.io/serviceaccount 2025-12-04T10:33:58.0671969Z tmpfs 1.7T 0 1.7T 0% /proc/acpi 2025-12-04T10:33:58.0672343Z tmpfs 1.7T 0 1.7T 0% /proc/scsi 2025-12-04T10:33:58.0672727Z tmpfs 1.7T 0 1.7T 0% /sys/firmware 2025-12-04T10:33:58.0673151Z tmpfs 1.7T 0 1.7T 0% /sys/devices/virtual/powercap 2025-12-04T10:33:58.0701812Z Prepare all required actions 2025-12-04T10:33:58.0702034Z Getting action download info 2025-12-04T10:33:58.4392831Z ##[group]Run ./.github/actions/download-td-artifacts 2025-12-04T10:33:58.4392989Z with: 2025-12-04T10:33:58.4393079Z env: 2025-12-04T10:33:58.4393181Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:33:58.4393326Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:33:58.4393512Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:33:58.4393677Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:33:58.4394194Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:33:58.4394702Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T10:33:58.4394849Z AWS_REGION: us-east-1 2025-12-04T10:33:58.4395045Z AWS_ACCESS_KEY_ID: *** 2025-12-04T10:33:58.4395207Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T10:33:58.4397241Z AWS_SESSION_TOKEN: *** 2025-12-04T10:33:58.4397350Z ##[endgroup] 2025-12-04T10:33:58.4410826Z ##[group]Run seemethere/download-artifact-s3@v4 2025-12-04T10:33:58.4410972Z with: 2025-12-04T10:33:58.4411073Z name: td_results 2025-12-04T10:33:58.4411182Z s3-bucket: gha-artifacts 2025-12-04T10:33:58.4411300Z region: us-east-1 2025-12-04T10:33:58.4411404Z env: 2025-12-04T10:33:58.4411502Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:33:58.4411646Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:33:58.4411836Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:33:58.4412014Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:33:58.4412527Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:33:58.4413021Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T10:33:58.4413147Z AWS_REGION: us-east-1 2025-12-04T10:33:58.4413288Z AWS_ACCESS_KEY_ID: *** 2025-12-04T10:33:58.4413449Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T10:33:58.4415431Z AWS_SESSION_TOKEN: *** 2025-12-04T10:33:58.4415545Z ##[endgroup] 2025-12-04T10:33:58.6742369Z (node:17098) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 2025-12-04T10:33:58.6742658Z 2025-12-04T10:33:58.6743008Z Please migrate your code to use AWS SDK for JavaScript (v3). 
2025-12-04T10:33:58.6743356Z For more information, check the migration guide at https://a.co/7PzMCcy 2025-12-04T10:33:58.6743710Z (Use `node --trace-warnings ...` to show where the warning was created) 2025-12-04T10:33:58.9650074Z Found 1 objects with prefix pytorch/pytorch/19922849170/td_results/ 2025-12-04T10:33:58.9650484Z Starting download (1/1): /home/runner/_work/pytorch/pytorch/td_results.json 2025-12-04T10:33:59.4216374Z Finished download (1/1): /home/runner/_work/pytorch/pytorch/td_results.json 2025-12-04T10:33:59.4220168Z Artifact download has finished successfully 2025-12-04T10:33:59.4411089Z ##[group]Run mkdir -p .additional_ci_files 2025-12-04T10:33:59.4411273Z mkdir -p .additional_ci_files 2025-12-04T10:33:59.4411453Z mv td_results.json .additional_ci_files/td_results.json || true 2025-12-04T10:33:59.4416014Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T10:33:59.4416172Z env: 2025-12-04T10:33:59.4416298Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:33:59.4416445Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:33:59.4416628Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:33:59.4416801Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:33:59.4417487Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:33:59.4417983Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T10:33:59.4418107Z AWS_REGION: us-east-1 2025-12-04T10:33:59.4418371Z AWS_ACCESS_KEY_ID: *** 2025-12-04T10:33:59.4418543Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T10:33:59.4420593Z AWS_SESSION_TOKEN: *** 2025-12-04T10:33:59.4420706Z ##[endgroup] 2025-12-04T10:33:59.4473369Z ##[group]Run .github/scripts/parse_ref.py 2025-12-04T10:33:59.4473527Z .github/scripts/parse_ref.py 2025-12-04T10:33:59.4476028Z shell: /usr/bin/bash -e {0} 2025-12-04T10:33:59.4476143Z env: 2025-12-04T10:33:59.4476241Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:33:59.4476381Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:33:59.4476563Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:33:59.4476733Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:33:59.4477248Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:33:59.4477744Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T10:33:59.4477864Z AWS_REGION: us-east-1 2025-12-04T10:33:59.4478016Z AWS_ACCESS_KEY_ID: *** 2025-12-04T10:33:59.4478178Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T10:33:59.4480285Z AWS_SESSION_TOKEN: *** 2025-12-04T10:33:59.4480394Z ##[endgroup] 2025-12-04T10:33:59.4575949Z Setting output branch=main 2025-12-04T10:33:59.4644951Z Prepare all required actions 2025-12-04T10:33:59.4645210Z Getting action download info 2025-12-04T10:33:59.6895302Z ##[group]Run ./.github/actions/filter-test-configs 2025-12-04T10:33:59.6895451Z with: 2025-12-04T10:33:59.6895730Z github-token: *** 2025-12-04T10:33:59.6898708Z test-matrix: {"include": [{"config": "default", "shard": 1, "num_shards": 6, "runner": 
"linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}]} 2025-12-04T10:33:59.6902117Z job-name: linux-jammy-rocm-py3.10 / test (distributed, 2, 3, linux.rocm.gpu.gfx942.4.b, mem_leak_check, unstable) 2025-12-04T10:33:59.6902334Z env: 2025-12-04T10:33:59.6902427Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:33:59.6902565Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:33:59.6902741Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:33:59.6902906Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:33:59.6903417Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin 
--cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:33:59.6903909Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T10:33:59.6904166Z AWS_REGION: us-east-1 2025-12-04T10:33:59.6904291Z AWS_ACCESS_KEY_ID: *** 2025-12-04T10:33:59.6904443Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T10:33:59.6906426Z AWS_SESSION_TOKEN: *** 2025-12-04T10:33:59.6906528Z ##[endgroup] 2025-12-04T10:33:59.6935174Z ##[group]Run nick-fields/retry@v3.0.0 2025-12-04T10:33:59.6935297Z with: 2025-12-04T10:33:59.6935380Z shell: bash 2025-12-04T10:33:59.6935473Z timeout_minutes: 10 2025-12-04T10:33:59.6935571Z max_attempts: 5 2025-12-04T10:33:59.6935667Z retry_wait_seconds: 30 2025-12-04T10:33:59.6935959Z command: set -eux # PyYAML 6.0 doesn't work with MacOS x86 anymore # This must run on Python-3.7 (AmazonLinux2) so can't use request=3.32.2 python3 -m pip install requests==2.27.1 pyyaml==6.0.2 2025-12-04T10:33:59.6936257Z polling_interval_seconds: 1 2025-12-04T10:33:59.6936438Z warning_on_retry: true 2025-12-04T10:33:59.6936541Z continue_on_error: false 2025-12-04T10:33:59.6936642Z env: 2025-12-04T10:33:59.6936732Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:33:59.6936863Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:33:59.6937038Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:33:59.6937198Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:33:59.6937702Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:33:59.6938187Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T10:33:59.6938296Z AWS_REGION: us-east-1 2025-12-04T10:33:59.6938424Z AWS_ACCESS_KEY_ID: *** 2025-12-04T10:33:59.6938578Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T10:33:59.6940625Z AWS_SESSION_TOKEN: *** 2025-12-04T10:33:59.6940788Z GITHUB_TOKEN: *** 2025-12-04T10:33:59.6940883Z ##[endgroup] 2025-12-04T10:33:59.7340498Z + python3 -m pip install requests==2.27.1 pyyaml==6.0.2 2025-12-04T10:33:59.8743590Z Defaulting to user installation because normal site-packages is not writeable 2025-12-04T10:33:59.9638570Z Collecting requests==2.27.1 2025-12-04T10:34:00.0140354Z Downloading requests-2.27.1-py2.py3-none-any.whl (63 kB) 2025-12-04T10:34:00.0238199Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 63.1/63.1 KB 6.3 MB/s eta 0:00:00 2025-12-04T10:34:00.0671724Z Collecting pyyaml==6.0.2 2025-12-04T10:34:00.0724602Z Downloading PyYAML-6.0.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (751 kB) 2025-12-04T10:34:00.1155097Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 751.2/751.2 KB 18.8 MB/s eta 0:00:00 2025-12-04T10:34:00.1333483Z Collecting idna<4,>=2.5 2025-12-04T10:34:00.1386713Z Downloading idna-3.11-py3-none-any.whl (71 kB) 2025-12-04T10:34:00.1406152Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 71.0/71.0 KB 57.7 MB/s eta 0:00:00 2025-12-04T10:34:00.1579080Z Collecting certifi>=2017.4.17 2025-12-04T10:34:00.1632394Z Downloading certifi-2025.11.12-py3-none-any.whl (159 kB) 2025-12-04T10:34:00.1692411Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 159.4/159.4 KB 29.5 MB/s eta 0:00:00 2025-12-04T10:34:00.1954372Z Collecting urllib3<1.27,>=1.21.1 2025-12-04T10:34:00.2010893Z Downloading urllib3-1.26.20-py2.py3-none-any.whl (144 kB) 2025-12-04T10:34:00.2067231Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 144.2/144.2 KB 28.8 MB/s eta 
0:00:00 2025-12-04T10:34:00.2969181Z Collecting charset-normalizer~=2.0.0 2025-12-04T10:34:00.3024865Z Downloading charset_normalizer-2.0.12-py3-none-any.whl (39 kB) 2025-12-04T10:34:00.3550486Z Installing collected packages: urllib3, pyyaml, idna, charset-normalizer, certifi, requests 2025-12-04T10:34:00.4475796Z WARNING: The script normalizer is installed in '/home/runner/.local/bin' which is not on PATH. 2025-12-04T10:34:00.4476734Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2025-12-04T10:34:00.4640930Z Successfully installed certifi-2025.11.12 charset-normalizer-2.0.12 idna-3.11 pyyaml-6.0.2 requests-2.27.1 urllib3-1.26.20 2025-12-04T10:34:00.7344594Z Command completed after 1 attempt(s). 2025-12-04T10:34:00.7390808Z ##[group]Run set -x 2025-12-04T10:34:00.7390977Z set -x 2025-12-04T10:34:00.7391106Z  2025-12-04T10:34:00.7391299Z # Use relative path here as this could be checked out anywhere, not necessarily 2025-12-04T10:34:00.7391529Z # in runner workspace 2025-12-04T10:34:00.7391726Z python3 "${GITHUB_ACTION_PATH}/../../scripts/parse_ref.py" 2025-12-04T10:34:00.7396761Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T10:34:00.7396952Z env: 2025-12-04T10:34:00.7397071Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:34:00.7397253Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:34:00.7397651Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:34:00.7397880Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:34:00.7398399Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:34:00.7398903Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T10:34:00.7399025Z AWS_REGION: us-east-1 2025-12-04T10:34:00.7399205Z AWS_ACCESS_KEY_ID: *** 2025-12-04T10:34:00.7399368Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T10:34:00.7401389Z AWS_SESSION_TOKEN: *** 2025-12-04T10:34:00.7401498Z ##[endgroup] 2025-12-04T10:34:00.7421954Z + python3 /home/runner/_work/pytorch/pytorch/./.github/actions/filter-test-configs/../../scripts/parse_ref.py 2025-12-04T10:34:00.7510528Z Setting output branch=main 2025-12-04T10:34:00.7547037Z ##[group]Run echo "Workflow: ${GITHUB_WORKFLOW}" 2025-12-04T10:34:00.7547286Z echo "Workflow: ${GITHUB_WORKFLOW}" 2025-12-04T10:34:00.7547457Z echo "Job name: ${JOB_NAME}" 2025-12-04T10:34:00.7547611Z  2025-12-04T10:34:00.7547807Z # Use relative path here as this could be checked out anywhere, not necessarily 2025-12-04T10:34:00.7548046Z # in runner workspace 2025-12-04T10:34:00.7548266Z python3 "${GITHUB_ACTION_PATH}/../../scripts/filter_test_configs.py" \ 2025-12-04T10:34:00.7548499Z  --workflow "${GITHUB_WORKFLOW}" \ 2025-12-04T10:34:00.7548673Z  --job-name "${JOB_NAME}" \ 2025-12-04T10:34:00.7552772Z  --test-matrix "{"include": [{"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, 
{"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}]}" \ 2025-12-04T10:34:00.7556367Z  --selected-test-configs "" \ 2025-12-04T10:34:00.7556503Z  --pr-number "${PR_NUMBER}" \ 2025-12-04T10:34:00.7556634Z  --tag "${TAG}" \ 2025-12-04T10:34:00.7556757Z  --event-name "${EVENT_NAME}" \ 2025-12-04T10:34:00.7556887Z  --schedule "${SCHEDULE}" \ 2025-12-04T10:34:00.7557012Z  --branch "${HEAD_BRANCH}" 2025-12-04T10:34:00.7561296Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T10:34:00.7561450Z env: 2025-12-04T10:34:00.7561546Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:34:00.7561688Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:34:00.7561880Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:34:00.7562050Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:34:00.7562570Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:34:00.7563097Z AWS_DEFAULT_REGION: 
us-east-1 2025-12-04T10:34:00.7563215Z AWS_REGION: us-east-1 2025-12-04T10:34:00.7563383Z AWS_ACCESS_KEY_ID: *** 2025-12-04T10:34:00.7563539Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T10:34:00.7565519Z AWS_SESSION_TOKEN: *** 2025-12-04T10:34:00.7565723Z GITHUB_TOKEN: *** 2025-12-04T10:34:00.7565924Z JOB_NAME: linux-jammy-rocm-py3.10 / test (distributed, 2, 3, linux.rocm.gpu.gfx942.4.b, mem_leak_check, unstable) 2025-12-04T10:34:00.7566132Z PR_NUMBER: 2025-12-04T10:34:00.7566221Z TAG: 2025-12-04T10:34:00.7566305Z EVENT_NAME: schedule 2025-12-04T10:34:00.7566404Z SCHEDULE: 29 8 * * * 2025-12-04T10:34:00.7566501Z HEAD_BRANCH: main 2025-12-04T10:34:00.7566597Z ##[endgroup] 2025-12-04T10:34:00.7586594Z Workflow: trunk-rocm-mi300 2025-12-04T10:34:00.7586813Z Job name: linux-jammy-rocm-py3.10 / test (distributed, 2, 3, linux.rocm.gpu.gfx942.4.b, mem_leak_check, unstable) 2025-12-04T10:34:01.3420283Z INFO:root:Issue https://github.com/pytorch/pytorch/issues/167616 created by jithunnair-amd has unstable all the test jobs for trunk-rocm-mi300 / linux-jammy-rocm-py3.10 / test (distributed, 2, 3, linux.rocm.gpu.gfx942.4.b, mem_leak_check, unstable) 2025-12-04T10:34:01.3649753Z Setting output keep-going=True 2025-12-04T10:34:01.3650068Z Setting output ci-verbose-test-logs=False 2025-12-04T10:34:01.3650321Z Setting output ci-test-showlocals=False 2025-12-04T10:34:01.3650809Z Setting output ci-no-test-timeout=False 2025-12-04T10:34:01.3651018Z Setting output ci-no-td=False 2025-12-04T10:34:01.3651218Z Setting output ci-td-distributed=False 2025-12-04T10:34:01.3651422Z Setting output is-unstable=True 2025-12-04T10:34:01.3651625Z Setting output reenabled-issues= 2025-12-04T10:34:01.3662068Z Setting output test-matrix={"include": [{"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 3, 
"num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, 
{"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}]} 2025-12-04T10:34:01.3671008Z Setting output is-test-matrix-empty=False 2025-12-04T10:34:01.3758874Z ##[group]Run echo "Filtered matrix:" 2025-12-04T10:34:01.3759130Z echo "Filtered matrix:" 2025-12-04T10:34:01.3768002Z echo "{"include": [{"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 1, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 2, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 3, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": 
"linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 4, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 5, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "default", "shard": 6, "num_shards": 6, "runner": "linux.rocm.gpu.gfx942.1.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}, {"config": 
"distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "mem_leak_check": "mem_leak_check", "unstable": "unstable", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "rerun_disabled_tests": "rerun_disabled_tests", "unstable": "unstable"}]}" 2025-12-04T10:34:01.3775200Z  2025-12-04T10:34:01.3775293Z echo 2025-12-04T10:34:01.3775413Z echo "Is the current job unstable? True" 2025-12-04T10:34:01.3775551Z  2025-12-04T10:34:01.3775633Z echo 2025-12-04T10:34:01.3775743Z echo "Is keep-going label set? True" 2025-12-04T10:34:01.3775872Z  2025-12-04T10:34:01.3775956Z echo 2025-12-04T10:34:01.3776059Z echo "Reenabled issues? " 2025-12-04T10:34:01.3780507Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T10:34:01.3780661Z env: 2025-12-04T10:34:01.3780764Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:34:01.3780903Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:34:01.3781091Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:34:01.3781260Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:34:01.3781761Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:34:01.3782248Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T10:34:01.3782373Z AWS_REGION: us-east-1 2025-12-04T10:34:01.3782556Z AWS_ACCESS_KEY_ID: *** 2025-12-04T10:34:01.3782781Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T10:34:01.3784753Z AWS_SESSION_TOKEN: *** 2025-12-04T10:34:01.3784863Z ##[endgroup] 2025-12-04T10:34:01.3804818Z Filtered matrix: 2025-12-04T10:34:01.3818477Z {include: [{config: default, shard: 1, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable}, {config: default, shard: 1, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable, rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 1, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable, mem_leak_check: mem_leak_check}, {config: default, shard: 1, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable}, {config: default, shard: 2, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable}, {config: default, shard: 2, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable, rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 2, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable, mem_leak_check: mem_leak_check}, {config: default, shard: 2, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable}, {config: default, shard: 3, 
num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable}, {config: default, shard: 3, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable, rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 3, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable, mem_leak_check: mem_leak_check}, {config: default, shard: 3, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable}, {config: default, shard: 4, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable}, {config: default, shard: 4, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable, rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 4, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable, mem_leak_check: mem_leak_check}, {config: default, shard: 4, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable}, {config: default, shard: 5, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable}, {config: default, shard: 5, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable, rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 5, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable, mem_leak_check: mem_leak_check}, {config: default, shard: 5, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable}, {config: default, shard: 6, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable}, {config: default, shard: 6, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, mem_leak_check: mem_leak_check, unstable: unstable, rerun_disabled_tests: rerun_disabled_tests}, {config: default, shard: 6, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable, mem_leak_check: mem_leak_check}, {config: default, shard: 6, num_shards: 6, runner: linux.rocm.gpu.gfx942.1.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable}, {config: distributed, shard: 1, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, mem_leak_check: mem_leak_check, unstable: unstable}, {config: distributed, shard: 1, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, mem_leak_check: mem_leak_check, unstable: unstable, rerun_disabled_tests: rerun_disabled_tests}, {config: distributed, shard: 1, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable, mem_leak_check: mem_leak_check}, {config: distributed, shard: 1, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable}, {config: distributed, shard: 2, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, mem_leak_check: mem_leak_check, unstable: unstable}, {config: distributed, shard: 2, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, mem_leak_check: mem_leak_check, unstable: unstable, rerun_disabled_tests: rerun_disabled_tests}, {config: distributed, shard: 2, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, rerun_disabled_tests: rerun_disabled_tests, 
unstable: unstable, mem_leak_check: mem_leak_check}, {config: distributed, shard: 2, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable}, {config: distributed, shard: 3, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, mem_leak_check: mem_leak_check, unstable: unstable}, {config: distributed, shard: 3, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, mem_leak_check: mem_leak_check, unstable: unstable, rerun_disabled_tests: rerun_disabled_tests}, {config: distributed, shard: 3, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable, mem_leak_check: mem_leak_check}, {config: distributed, shard: 3, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, rerun_disabled_tests: rerun_disabled_tests, unstable: unstable}]} 2025-12-04T10:34:01.3827365Z 2025-12-04T10:34:01.3827435Z Is the current job unstable? True 2025-12-04T10:34:01.3827544Z 2025-12-04T10:34:01.3827603Z Is keep-going label set? True 2025-12-04T10:34:01.3827708Z 2025-12-04T10:34:01.3827761Z Reenabled issues? 2025-12-04T10:34:01.3854814Z ##[group]Run echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2025-12-04T10:34:01.3855052Z echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2025-12-04T10:34:01.3860087Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T10:34:01.3860242Z env: 2025-12-04T10:34:01.3860343Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:34:01.3860486Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:34:01.3860671Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:34:01.3860842Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:34:01.3861356Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:34:01.3861873Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T10:34:01.3861995Z AWS_REGION: us-east-1 2025-12-04T10:34:01.3862175Z AWS_ACCESS_KEY_ID: *** 2025-12-04T10:34:01.3862333Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T10:34:01.3864361Z AWS_SESSION_TOKEN: *** 2025-12-04T10:34:01.3864471Z JOB_TIMEOUT: 600 2025-12-04T10:34:01.3864575Z ##[endgroup] 2025-12-04T10:34:01.3912538Z ##[group]Run env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-12-04T10:34:01.3912817Z env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-12-04T10:34:01.3913057Z env | grep '^CI' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-12-04T10:34:01.3917659Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T10:34:01.3917857Z env: 2025-12-04T10:34:01.3917988Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:34:01.3918174Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:34:01.3918411Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:34:01.3918630Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:34:01.3919261Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:34:01.3919838Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T10:34:01.3919966Z 
AWS_REGION: us-east-1 2025-12-04T10:34:01.3920145Z AWS_ACCESS_KEY_ID: *** 2025-12-04T10:34:01.3920310Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T10:34:01.3922358Z AWS_SESSION_TOKEN: *** 2025-12-04T10:34:01.3922475Z ##[endgroup] 2025-12-04T10:34:01.4001551Z ##[group]Run set -x 2025-12-04T10:34:01.4001705Z set -x 2025-12-04T10:34:01.4001806Z  2025-12-04T10:34:01.4001922Z if [[ $TEST_CONFIG == 'multigpu' ]]; then 2025-12-04T10:34:01.4002086Z  TEST_COMMAND=.ci/pytorch/multigpu-test.sh 2025-12-04T10:34:01.4002250Z elif [[ $BUILD_ENVIRONMENT == *onnx* ]]; then 2025-12-04T10:34:01.4002411Z  TEST_COMMAND=.ci/caffe2/test.sh 2025-12-04T10:34:01.4002552Z else 2025-12-04T10:34:01.4002663Z  TEST_COMMAND=.ci/pytorch/test.sh 2025-12-04T10:34:01.4002789Z fi 2025-12-04T10:34:01.4002881Z  2025-12-04T10:34:01.4003017Z # detached container should get cleaned up by teardown_ec2_linux 2025-12-04T10:34:01.4003223Z # TODO: Stop building test binaries as part of the build phase 2025-12-04T10:34:01.4003407Z # Used for GPU_FLAG since that doesn't play nice 2025-12-04T10:34:01.4003580Z # shellcheck disable=SC2086,SC2090 2025-12-04T10:34:01.4003719Z container_name=$(docker run \ 2025-12-04T10:34:01.4003850Z  ${GPU_FLAG:-} \ 2025-12-04T10:34:01.4003972Z  -e BUILD_ENVIRONMENT \ 2025-12-04T10:34:01.4004098Z  -e PR_NUMBER \ 2025-12-04T10:34:01.4004214Z  -e GITHUB_ACTIONS \ 2025-12-04T10:34:01.4004334Z  -e GITHUB_REPOSITORY \ 2025-12-04T10:34:01.4014603Z  -e GITHUB_WORKFLOW \ 2025-12-04T10:34:01.4014747Z  -e GITHUB_JOB \ 2025-12-04T10:34:01.4015003Z  -e GITHUB_RUN_ID \ 2025-12-04T10:34:01.4015125Z  -e GITHUB_RUN_NUMBER \ 2025-12-04T10:34:01.4015255Z  -e GITHUB_RUN_ATTEMPT \ 2025-12-04T10:34:01.4015379Z  -e JOB_ID \ 2025-12-04T10:34:01.4015485Z  -e JOB_NAME \ 2025-12-04T10:34:01.4015597Z  -e BASE_SHA \ 2025-12-04T10:34:01.4015706Z  -e BRANCH \ 2025-12-04T10:34:01.4015810Z  -e SHA1 \ 2025-12-04T10:34:01.4015926Z  -e AWS_DEFAULT_REGION \ 2025-12-04T10:34:01.4016050Z  -e IN_WHEEL_TEST \ 2025-12-04T10:34:01.4016169Z  -e SHARD_NUMBER \ 2025-12-04T10:34:01.4016287Z  -e TEST_CONFIG \ 2025-12-04T10:34:01.4016403Z  -e NUM_TEST_SHARDS \ 2025-12-04T10:34:01.4016526Z  -e REENABLED_ISSUES \ 2025-12-04T10:34:01.4016654Z  -e CONTINUE_THROUGH_ERROR \ 2025-12-04T10:34:01.4016784Z  -e VERBOSE_TEST_LOGS \ 2025-12-04T10:34:01.4016907Z  -e TEST_SHOWLOCALS \ 2025-12-04T10:34:01.4017033Z  -e NO_TEST_TIMEOUT \ 2025-12-04T10:34:01.4017152Z  -e NO_TD \ 2025-12-04T10:34:01.4017273Z  -e MAX_JOBS="$(nproc --ignore=2)" \ 2025-12-04T10:34:01.4017424Z  -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK \ 2025-12-04T10:34:01.4017573Z  -e PYTORCH_TEST_RERUN_DISABLED_TESTS \ 2025-12-04T10:34:01.4017712Z  -e TESTS_TO_INCLUDE \ 2025-12-04T10:34:01.4017837Z  -e HUGGING_FACE_HUB_TOKEN \ 2025-12-04T10:34:01.4017964Z  -e DASHBOARD_TAG \ 2025-12-04T10:34:01.4018116Z  --env-file="${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" \ 2025-12-04T10:34:01.4018282Z  --ulimit stack=10485760:83886080 \ 2025-12-04T10:34:01.4018411Z  --ulimit core=0 \ 2025-12-04T10:34:01.4018549Z  --env-file="/tmp/github_env_${GITHUB_RUN_ID}" \ 2025-12-04T10:34:01.4018707Z  --security-opt seccomp=unconfined \ 2025-12-04T10:34:01.4018848Z  --cap-add=SYS_PTRACE \ 2025-12-04T10:34:01.4018977Z  --shm-size="8g" \ 2025-12-04T10:34:01.4019090Z  --tty \ 2025-12-04T10:34:01.4019192Z  --detach \ 2025-12-04T10:34:01.4019307Z  --name="${container_name}" \ 2025-12-04T10:34:01.4019434Z  --user jenkins \ 2025-12-04T10:34:01.4019624Z  -v "${GITHUB_WORKSPACE}:/var/lib/jenkins/workspace" \ 2025-12-04T10:34:01.4019784Z  -w 
/var/lib/jenkins/workspace \ 2025-12-04T10:34:01.4019975Z  "${DOCKER_IMAGE}" 2025-12-04T10:34:01.4020084Z ) 2025-12-04T10:34:01.4020193Z # save container name for later step 2025-12-04T10:34:01.4020359Z echo "CONTAINER_NAME=${container_name}" >> "$GITHUB_ENV" 2025-12-04T10:34:01.4020636Z # jenkins user does not have write permission to mounted workspace; work-around by copying within container to jenkins home 2025-12-04T10:34:01.4020988Z docker exec -t "${container_name}" sh -c "cd .. && cp -R workspace pytorch && cd pytorch && pip install dist/*.whl && ${TEST_COMMAND}" 2025-12-04T10:34:01.4024090Z shell: /usr/bin/bash -e {0} 2025-12-04T10:34:01.4024206Z env: 2025-12-04T10:34:01.4024305Z GIT_DEFAULT_BRANCH: main 2025-12-04T10:34:01.4024448Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T10:34:01.4024630Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T10:34:01.4024800Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T10:34:01.4025315Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T10:34:01.4025808Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T10:34:01.4025927Z AWS_REGION: us-east-1 2025-12-04T10:34:01.4026075Z AWS_ACCESS_KEY_ID: *** 2025-12-04T10:34:01.4026281Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T10:34:01.4028277Z AWS_SESSION_TOKEN: *** 2025-12-04T10:34:01.4028405Z BUILD_ENVIRONMENT: linux-jammy-rocm-py3.10 2025-12-04T10:34:01.4028538Z PR_NUMBER: 2025-12-04T10:34:01.4028644Z GITHUB_REPOSITORY: pytorch/pytorch 2025-12-04T10:34:01.4028777Z GITHUB_WORKFLOW: trunk-rocm-mi300 2025-12-04T10:34:01.4028899Z GITHUB_JOB: test 2025-12-04T10:34:01.4029003Z GITHUB_RUN_ID: 19922849170 2025-12-04T10:34:01.4029118Z GITHUB_RUN_NUMBER: 689 2025-12-04T10:34:01.4029233Z GITHUB_RUN_ATTEMPT: 1 2025-12-04T10:34:01.4029338Z JOB_ID: 57116213187 2025-12-04T10:34:01.4029545Z JOB_NAME: linux-jammy-rocm-py3.10 / test (distributed, 2, 3, linux.rocm.gpu.gfx942.4.b, mem_leak_check, unstable) 2025-12-04T10:34:01.4029805Z BRANCH: main 2025-12-04T10:34:01.4029919Z SHA1: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T10:34:01.4030074Z BASE_SHA: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T10:34:01.4030213Z TEST_CONFIG: distributed 2025-12-04T10:34:01.4030326Z SHARD_NUMBER: 2 2025-12-04T10:34:01.4030424Z NUM_TEST_SHARDS: 3 2025-12-04T10:34:01.4030529Z REENABLED_ISSUES: 2025-12-04T10:34:01.4030634Z CONTINUE_THROUGH_ERROR: True 2025-12-04T10:34:01.4030750Z VERBOSE_TEST_LOGS: False 2025-12-04T10:34:01.4030858Z TEST_SHOWLOCALS: False 2025-12-04T10:34:01.4030965Z NO_TEST_TIMEOUT: False 2025-12-04T10:34:01.4031067Z NO_TD: False 2025-12-04T10:34:01.4031340Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T10:34:01.4031637Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK: 1 2025-12-04T10:34:01.4031769Z PYTORCH_TEST_RERUN_DISABLED_TESTS: 0 2025-12-04T10:34:01.4031891Z TESTS_TO_INCLUDE: 2025-12-04T10:34:01.4031993Z DASHBOARD_TAG: 2025-12-04T10:34:01.4032135Z HUGGING_FACE_HUB_TOKEN: *** 2025-12-04T10:34:01.4032249Z ##[endgroup] 2025-12-04T10:34:01.4048560Z + [[ distributed == \m\u\l\t\i\g\p\u ]] 2025-12-04T10:34:01.4048998Z + [[ linux-jammy-rocm-py3.10 == *onnx* ]] 2025-12-04T10:34:01.4049372Z + 
TEST_COMMAND=.ci/pytorch/test.sh 2025-12-04T10:34:01.4055398Z +++ nproc --ignore=2 2025-12-04T10:34:01.4065519Z ++ docker run --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host -e BUILD_ENVIRONMENT -e PR_NUMBER -e GITHUB_ACTIONS -e GITHUB_REPOSITORY -e GITHUB_WORKFLOW -e GITHUB_JOB -e GITHUB_RUN_ID -e GITHUB_RUN_NUMBER -e GITHUB_RUN_ATTEMPT -e JOB_ID -e JOB_NAME -e BASE_SHA -e BRANCH -e SHA1 -e AWS_DEFAULT_REGION -e IN_WHEEL_TEST -e SHARD_NUMBER -e TEST_CONFIG -e NUM_TEST_SHARDS -e REENABLED_ISSUES -e CONTINUE_THROUGH_ERROR -e VERBOSE_TEST_LOGS -e TEST_SHOWLOCALS -e NO_TEST_TIMEOUT -e NO_TD -e MAX_JOBS=126 -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK -e PYTORCH_TEST_RERUN_DISABLED_TESTS -e TESTS_TO_INCLUDE -e HUGGING_FACE_HUB_TOKEN -e DASHBOARD_TAG --env-file=/home/runner/_work/_temp/github_env_19922849170 --ulimit stack=10485760:83886080 --ulimit core=0 --env-file=/tmp/github_env_19922849170 --security-opt seccomp=unconfined --cap-add=SYS_PTRACE --shm-size=8g --tty --detach --name= --user jenkins -v /home/runner/_work/pytorch/pytorch:/var/lib/jenkins/workspace -w /var/lib/jenkins/workspace 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-jammy-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T10:34:01.5933874Z + container_name=8ac2e1cca5c5a27e1632f345b696937e28036af305fa8c40ccee0a9f11fed68c 2025-12-04T10:34:01.5934267Z + echo CONTAINER_NAME=8ac2e1cca5c5a27e1632f345b696937e28036af305fa8c40ccee0a9f11fed68c 2025-12-04T10:34:01.5934851Z + docker exec -t 8ac2e1cca5c5a27e1632f345b696937e28036af305fa8c40ccee0a9f11fed68c sh -c 'cd .. 
&& cp -R workspace pytorch && cd pytorch && pip install dist/*.whl && .ci/pytorch/test.sh' 2025-12-04T10:34:04.8557576Z Processing ./dist/torch-2.10.0a0+gitffd9b0f-cp310-cp310-linux_x86_64.whl 2025-12-04T10:34:05.4105025Z Requirement already satisfied: filelock in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+gitffd9b0f) (3.18.0) 2025-12-04T10:34:05.4106313Z Requirement already satisfied: typing-extensions>=4.10.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+gitffd9b0f) (4.12.2) 2025-12-04T10:34:05.4107876Z Requirement already satisfied: sympy>=1.13.3 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+gitffd9b0f) (1.13.3) 2025-12-04T10:34:05.4110374Z Requirement already satisfied: networkx>=2.5.1 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+gitffd9b0f) (2.8.8) 2025-12-04T10:34:05.4112621Z Requirement already satisfied: jinja2 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+gitffd9b0f) (3.1.6) 2025-12-04T10:34:05.4113701Z Requirement already satisfied: fsspec>=0.8.5 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from torch==2.10.0a0+gitffd9b0f) (2025.10.0) 2025-12-04T10:34:05.4274394Z Requirement already satisfied: mpmath<1.4,>=1.1.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from sympy>=1.13.3->torch==2.10.0a0+gitffd9b0f) (1.3.0) 2025-12-04T10:34:05.4298092Z Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from jinja2->torch==2.10.0a0+gitffd9b0f) (3.0.3) 2025-12-04T10:34:05.6241484Z Installing collected packages: torch 2025-12-04T10:34:11.1293427Z Successfully installed torch-2.10.0a0+gitffd9b0f 2025-12-04T10:34:11.1676440Z + export TERM=vt100 2025-12-04T10:34:11.1676843Z + TERM=vt100 2025-12-04T10:34:11.1680009Z ++ dirname .ci/pytorch/test.sh 2025-12-04T10:34:11.1696934Z + source .ci/pytorch/common.sh 2025-12-04T10:34:11.1699496Z +++ dirname .ci/pytorch/common.sh 2025-12-04T10:34:11.1707856Z ++ source .ci/pytorch/common_utils.sh 2025-12-04T10:34:11.1709865Z +++ declare -f -t trap_add 2025-12-04T10:34:11.1715920Z ++ set -ex -o pipefail 2025-12-04T10:34:11.1716229Z ++ [[ linux-jammy-rocm-py3.10 == *rocm* ]] 2025-12-04T10:34:11.1716487Z ++ unset HIP_PLATFORM 2025-12-04T10:34:11.1716695Z ++ export PYTORCH_TEST_WITH_ROCM=1 2025-12-04T10:34:11.1716928Z ++ PYTORCH_TEST_WITH_ROCM=1 2025-12-04T10:34:11.1717147Z ++ BUILD_TEST_LIBTORCH=0 2025-12-04T10:34:11.1721129Z ++ dirname .ci/pytorch/test.sh 2025-12-04T10:34:11.1728325Z + source .ci/pytorch/common-build.sh 2025-12-04T10:34:11.1729857Z ++ [[ linux-jammy-rocm-py3.10 != *win-* ]] 2025-12-04T10:34:11.1736243Z ++++ dirname .ci/pytorch/common-build.sh 2025-12-04T10:34:11.1745710Z +++ cd .ci/pytorch 2025-12-04T10:34:11.1745895Z +++ pwd -P 2025-12-04T10:34:11.1748276Z ++ script_dir=/var/lib/jenkins/pytorch/.ci/pytorch 2025-12-04T10:34:11.1748560Z ++ [[ linux-jammy-rocm-py3.10 == *-pch* ]] 2025-12-04T10:34:11.1748751Z ++ which sccache 2025-12-04T10:34:11.1760989Z ++ [[ -z '' ]] 2025-12-04T10:34:11.1761199Z ++ unset SCCACHE_BUCKET 2025-12-04T10:34:11.1761353Z ++ unset SCCACHE_REGION 2025-12-04T10:34:11.1761513Z ++ sccache --stop-server 2025-12-04T10:34:11.1782475Z ++ true 2025-12-04T10:34:11.1782649Z ++ rm -f /var/lib/jenkins/sccache_error.log 2025-12-04T10:34:11.1792683Z ++ trap_add sccache_epilogue EXIT 2025-12-04T10:34:11.1792881Z ++ trap_add_cmd=sccache_epilogue 2025-12-04T10:34:11.1793042Z ++ shift 
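The `trap_add sccache_epilogue EXIT` trace above shows a common bash pattern: read back whatever EXIT handler is already installed with `trap -p`, then re-install it with the new command chained on, so sccache teardown is appended rather than clobbering earlier cleanup. A minimal, self-contained sketch of that pattern follows; the helper name and the sed-based parsing are illustrative only, not the repository's exact common_utils.sh implementation.

#!/usr/bin/env bash
# Sketch only: chain a new command onto an existing trap without clobbering it.
trap_add_sketch() {
  local new_cmd=$1; shift
  local sig existing
  for sig in "$@"; do
    # 'trap -p SIG' prints e.g.: trap -- 'old_cmd' EXIT ; extract old_cmd (may be empty).
    existing=$(trap -p "$sig" | sed -n "s/^trap -- '\(.*\)' $sig\$/\1/p")
    # Re-install with the new command appended, so earlier handlers still run first.
    trap -- "${existing:+$existing; }$new_cmd" "$sig"
  done
}

first_cleanup()  { echo "first cleanup";  }
second_cleanup() { echo "second cleanup"; }
trap_add_sketch first_cleanup  EXIT
trap_add_sketch second_cleanup EXIT   # on exit, both handlers run in registration order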
2025-12-04T10:34:11.1793191Z ++ for trap_add_name in "$@" 2025-12-04T10:34:11.1800844Z ++++ trap -p EXIT 2025-12-04T10:34:11.1803144Z +++ eval 'extract_trap_cmd ' 2025-12-04T10:34:11.1803320Z ++++ extract_trap_cmd 2025-12-04T10:34:11.1803464Z ++++ printf '%s\n' '' 2025-12-04T10:34:11.1804002Z +++ printf '%s\n' sccache_epilogue 2025-12-04T10:34:11.1805936Z ++ trap -- ' 2025-12-04T10:34:11.1806078Z sccache_epilogue' EXIT 2025-12-04T10:34:11.1806226Z ++ [[ -n '' ]] 2025-12-04T10:34:11.1806382Z ++ [[ linux-jammy-rocm-py3.10 == *rocm* ]] 2025-12-04T10:34:11.1806620Z ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 2025-12-04T10:34:11.1806826Z ++ SCCACHE_IDLE_TIMEOUT=0 2025-12-04T10:34:11.1806986Z ++ sccache --start-server 2025-12-04T10:34:11.1824585Z sccache: Starting the server... 2025-12-04T10:34:11.2010615Z sccache: Listening on address 127.0.0.1:4226 2025-12-04T10:34:11.2018176Z ++ sccache --zero-stats 2025-12-04T10:34:11.2033411Z Statistics zeroed. 2025-12-04T10:34:11.2035621Z ++ which ccache 2025-12-04T10:34:11.2043370Z + [[ linux-jammy-rocm-py3.10 != *rocm* ]] 2025-12-04T10:34:11.2043576Z + [[ linux-jammy-rocm-py3.10 == *cuda* ]] 2025-12-04T10:34:11.2043757Z + echo 'Environment variables:' 2025-12-04T10:34:11.2043927Z Environment variables: 2025-12-04T10:34:11.2044069Z + env 2025-12-04T10:34:11.2050582Z GITHUB_WORKSPACE=/home/runner/_work/pytorch/pytorch 2025-12-04T10:34:11.2050786Z CONTINUE_THROUGH_ERROR=True 2025-12-04T10:34:11.2050977Z BUILD_ENVIRONMENT=linux-jammy-rocm-py3.10 2025-12-04T10:34:11.2051204Z HOSTNAME=linux.rocm.gpu.gfx942.4.b-bphpw-runner-5l4hk 2025-12-04T10:34:11.2051513Z GITHUB_PATH=/home/runner/_work/_temp/_runner_file_commands/add_path_c2a8d1b0-3c11-4303-ae05-bde093f6a837 2025-12-04T10:34:11.2051780Z GITHUB_ACTION=__run_2 2025-12-04T10:34:11.2051929Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 2025-12-04T10:34:11.2052082Z GITHUB_RUN_NUMBER=689 2025-12-04T10:34:11.2052218Z TEST_CONFIG=distributed 2025-12-04T10:34:11.2052392Z RUNNER_NAME=linux.rocm.gpu.gfx942.4.b-bphpw-runner-5l4hk 2025-12-04T10:34:11.2052595Z GITHUB_REPOSITORY_OWNER_ID=21003710 2025-12-04T10:34:11.2052756Z AWS_DEFAULT_REGION=us-east-1 2025-12-04T10:34:11.2052939Z RUNNER_ARTIFACT_DIR=/home/runner/_work/_temp/artifacts 2025-12-04T10:34:11.2053141Z GITHUB_TRIGGERING_ACTOR=pytorchmergebot 2025-12-04T10:34:11.2053305Z GITHUB_REF_TYPE=branch 2025-12-04T10:34:11.2053516Z BASE_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T10:34:11.2053890Z HUGGING_FACE_HUB_TOKEN=*** 2025-12-04T10:34:11.2056666Z *** 2025-12-04T10:34:11.2056790Z GITHUB_REPOSITORY_ID=65600975 2025-12-04T10:34:11.2056941Z GITHUB_ACTIONS=true 2025-12-04T10:34:11.2057092Z SHA1=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T10:34:11.2057286Z GITHUB_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T10:34:11.2057563Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/trunk-rocm-mi300.yml@refs/heads/main 2025-12-04T10:34:11.2057808Z UCC_HOME=/usr 2025-12-04T10:34:11.2057925Z RUNNER_ENVIRONMENT=self-hosted 2025-12-04T10:34:11.2058065Z VERBOSE_TEST_LOGS=False 2025-12-04T10:34:11.2058187Z GITHUB_REF=refs/heads/main 2025-12-04T10:34:11.2058313Z RUNNER_OS=Linux 2025-12-04T10:34:11.2058420Z SHARD_NUMBER=2 2025-12-04T10:34:11.2058535Z GITHUB_REF_PROTECTED=true 2025-12-04T10:34:11.2058814Z RUNNER_MANUALLY_TRAP_SIG=1 2025-12-04T10:34:11.2058939Z HOME=/var/lib/jenkins 2025-12-04T10:34:11.2059083Z GITHUB_API_URL=https://api.github.com 2025-12-04T10:34:11.2059241Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2025-12-04T10:34:11.2059401Z 
RUNNER_DOCS_DIR=/home/runner/_work/_temp/docs 2025-12-04T10:34:11.2059547Z LANG=C.UTF-8 2025-12-04T10:34:11.2059725Z UCX_COMMIT=29831d319e6be55cb8c768ca61de335c934ca39e 2025-12-04T10:34:11.2059890Z PYTORCH_TEST_WITH_ROCM=1 2025-12-04T10:34:11.2060063Z RUNNER_TRACKING_ID=github_4b208c78-f2ba-477a-8e64-14a9af1f4823 2025-12-04T10:34:11.2060234Z RUNNER_ARCH=X64 2025-12-04T10:34:11.2060360Z RUNNER_TEMP=/home/runner/_work/_temp 2025-12-04T10:34:11.2060494Z NUM_TEST_SHARDS=3 2025-12-04T10:34:11.2060607Z UCX_HOME=/usr 2025-12-04T10:34:11.2060826Z GITHUB_STATE=/home/runner/_work/_temp/_runner_file_commands/save_state_c2a8d1b0-3c11-4303-ae05-bde093f6a837 2025-12-04T10:34:11.2061188Z JOB_NAME=linux-jammy-rocm-py3.10 / test (distributed, 2, 3, linux.rocm.gpu.gfx942.4.b, mem_leak_check, unstable) 2025-12-04T10:34:11.2061436Z MAGMA_HOME=/opt/rocm/magma 2025-12-04T10:34:11.2061654Z GITHUB_ENV=/home/runner/_work/_temp/_runner_file_commands/set_env_c2a8d1b0-3c11-4303-ae05-bde093f6a837 2025-12-04T10:34:11.2061940Z GITHUB_EVENT_PATH=/home/runner/_work/_temp/_github_workflow/event.json 2025-12-04T10:34:11.2062125Z GITHUB_EVENT_NAME=schedule 2025-12-04T10:34:11.2062312Z GITHUB_ACTIONS_RUNNER_EXTRA_USER_AGENT=actions-runner-controller/0.12.1 2025-12-04T10:34:11.2062505Z DASHBOARD_TAG= 2025-12-04T10:34:11.2062666Z GITHUB_RUN_ID=19922849170 2025-12-04T10:34:11.2062904Z GITHUB_STEP_SUMMARY=/home/runner/_work/_temp/_runner_file_commands/step_summary_c2a8d1b0-3c11-4303-ae05-bde093f6a837 2025-12-04T10:34:11.2063164Z GITHUB_ACTOR=pytorchmergebot 2025-12-04T10:34:11.2063295Z PR_NUMBER= 2025-12-04T10:34:11.2063403Z GITHUB_RUN_ATTEMPT=1 2025-12-04T10:34:11.2063532Z ANACONDA_PYTHON_VERSION=3.10 2025-12-04T10:34:11.2063688Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2025-12-04T10:34:11.2063854Z TERM=vt100 2025-12-04T10:34:11.2063961Z INSTALLED_VISION=yes 2025-12-04T10:34:11.2064078Z BRANCH=main 2025-12-04T10:34:11.2064199Z OPENSSL_ROOT_DIR=/opt/openssl 2025-12-04T10:34:11.2064324Z TESTS_TO_INCLUDE= 2025-12-04T10:34:11.2064513Z GITHUB_ACTION_PATH=/home/runner/_work/pytorch/pytorch/./.github/actions/setup-rocm 2025-12-04T10:34:11.2064731Z GITHUB_SERVER_URL=https://github.com 2025-12-04T10:34:11.2064889Z PYTORCH_ROCM_ARCH=gfx90a;gfx942;gfx950;gfx1100 2025-12-04T10:34:11.2065066Z UCC_COMMIT=9f4b242cbbd8b1462cbc732eb29316cdfa124b77 2025-12-04T10:34:11.2065227Z REENABLED_ISSUES= 2025-12-04T10:34:11.2065335Z SHLVL=1 2025-12-04T10:34:11.2065435Z MAX_JOBS=126 2025-12-04T10:34:11.2065586Z RUNNER_TEST_RESULTS_DIR=/home/runner/_work/_temp/test-results 2025-12-04T10:34:11.2065763Z GITHUB_ACTOR_ID=97764156 2025-12-04T10:34:11.2065896Z RUNNER_TOOL_CACHE=/home/runner/_work/_tool 2025-12-04T10:34:11.2066075Z GITHUB_WORKFLOW_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T10:34:11.2066249Z GITHUB_REF_NAME=main 2025-12-04T10:34:11.2066371Z ROCM_PATH=/opt/rocm 2025-12-04T10:34:11.2066490Z GITHUB_JOB=test 2025-12-04T10:34:11.2066604Z NO_TEST_TIMEOUT=False 2025-12-04T10:34:11.2066732Z GITHUB_REPOSITORY=pytorch/pytorch 2025-12-04T10:34:11.2066870Z LC_ALL=C.UTF-8 2025-12-04T10:34:11.2066982Z GITHUB_RETENTION_DAYS=90 2025-12-04T10:34:11.2067123Z RUNNER_WORKSPACE=/home/runner/_work/pytorch 2025-12-04T10:34:11.2067276Z OPENSSL_DIR=/opt/openssl 2025-12-04T10:34:11.2067403Z GITHUB_ACTION_REPOSITORY= 2025-12-04T10:34:11.2067827Z 
PATH=/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-12-04T10:34:11.2068208Z GITHUB_BASE_REF= 2025-12-04T10:34:11.2068306Z CI=true 2025-12-04T10:34:11.2068403Z GITHUB_REPOSITORY_OWNER=pytorch 2025-12-04T10:34:11.2068521Z JOB_ID=57116213187 2025-12-04T10:34:11.2068618Z GITHUB_HEAD_REF= 2025-12-04T10:34:11.2068715Z GITHUB_ACTION_REF= 2025-12-04T10:34:11.2068859Z TEST_SHOWLOCALS=False 2025-12-04T10:34:11.2068978Z GITHUB_WORKFLOW=trunk-rocm-mi300 2025-12-04T10:34:11.2069106Z DEBIAN_FRONTEND=noninteractive 2025-12-04T10:34:11.2069323Z GITHUB_OUTPUT=/home/runner/_work/_temp/_runner_file_commands/set_output_c2a8d1b0-3c11-4303-ae05-bde093f6a837 2025-12-04T10:34:11.2069538Z NO_TD=False 2025-12-04T10:34:11.2069702Z OLDPWD=/var/lib/jenkins 2025-12-04T10:34:11.2069806Z _=/usr/bin/env 2025-12-04T10:34:11.2069950Z ++ python -c 'import site; print(site.getsitepackages()[0])' 2025-12-04T10:34:11.2117750Z + TORCH_INSTALL_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch 2025-12-04T10:34:11.2117998Z + TORCH_BIN_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/bin 2025-12-04T10:34:11.2118216Z + TORCH_LIB_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/lib 2025-12-04T10:34:11.2118433Z + TORCH_TEST_DIR=/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/test 2025-12-04T10:34:11.2118602Z + BUILD_DIR=build 2025-12-04T10:34:11.2118710Z + BUILD_RENAMED_DIR=build_renamed 2025-12-04T10:34:11.2118831Z + BUILD_BIN_DIR=build/bin 2025-12-04T10:34:11.2118938Z + SHARD_NUMBER=2 2025-12-04T10:34:11.2120349Z + NUM_TEST_SHARDS=3 2025-12-04T10:34:11.2120539Z + export TORCH_SERIALIZATION_DEBUG=1 2025-12-04T10:34:11.2120698Z + TORCH_SERIALIZATION_DEBUG=1 2025-12-04T10:34:11.2120831Z + export VALGRIND=ON 2025-12-04T10:34:11.2120950Z + VALGRIND=ON 2025-12-04T10:34:11.2121086Z + [[ linux-jammy-rocm-py3.10 == *clang9* ]] 2025-12-04T10:34:11.2121496Z + [[ linux-jammy-rocm-py3.10 == *xpu* ]] 2025-12-04T10:34:11.2121631Z + detect_cuda_arch 2025-12-04T10:34:11.2121754Z + [[ linux-jammy-rocm-py3.10 == *cuda* ]] 2025-12-04T10:34:11.2121904Z + [[ linux-jammy-rocm-py3.10 == *s390x* ]] 2025-12-04T10:34:11.2122038Z + [[ 0 == \1 ]] 2025-12-04T10:34:11.2122143Z + [[ True == \1 ]] 2025-12-04T10:34:11.2122265Z + [[ linux-jammy-rocm-py3.10 != *bazel* ]] 2025-12-04T10:34:11.2122834Z ++ realpath build/custom_test_artifacts 2025-12-04T10:34:11.2128907Z + CUSTOM_TEST_ARTIFACT_BUILD_DIR=/var/lib/jenkins/pytorch/build/custom_test_artifacts 2025-12-04T10:34:11.2129342Z + [[ -n '' ]] 2025-12-04T10:34:11.2129560Z + echo 'Environment variables' 2025-12-04T10:34:11.2129844Z Environment variables 2025-12-04T10:34:11.2130038Z + env 2025-12-04T10:34:11.2134530Z GITHUB_WORKSPACE=/home/runner/_work/pytorch/pytorch 2025-12-04T10:34:11.2134856Z CONTINUE_THROUGH_ERROR=True 2025-12-04T10:34:11.2135118Z BUILD_ENVIRONMENT=linux-jammy-rocm-py3.10 2025-12-04T10:34:11.2135458Z HOSTNAME=linux.rocm.gpu.gfx942.4.b-bphpw-runner-5l4hk 2025-12-04T10:34:11.2135947Z GITHUB_PATH=/home/runner/_work/_temp/_runner_file_commands/add_path_c2a8d1b0-3c11-4303-ae05-bde093f6a837 2025-12-04T10:34:11.2136366Z GITHUB_ACTION=__run_2 2025-12-04T10:34:11.2136591Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 2025-12-04T10:34:11.2136845Z GITHUB_RUN_NUMBER=689 2025-12-04T10:34:11.2137050Z TEST_CONFIG=distributed 2025-12-04T10:34:11.2137327Z RUNNER_NAME=linux.rocm.gpu.gfx942.4.b-bphpw-runner-5l4hk 
2025-12-04T10:34:11.2137644Z GITHUB_REPOSITORY_OWNER_ID=21003710 2025-12-04T10:34:11.2137882Z AWS_DEFAULT_REGION=us-east-1 2025-12-04T10:34:11.2138133Z RUNNER_ARTIFACT_DIR=/home/runner/_work/_temp/artifacts 2025-12-04T10:34:11.2138404Z GITHUB_TRIGGERING_ACTOR=pytorchmergebot 2025-12-04T10:34:11.2138633Z GITHUB_REF_TYPE=branch 2025-12-04T10:34:11.2138876Z BASE_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T10:34:11.2139319Z HUGGING_FACE_HUB_TOKEN=*** 2025-12-04T10:34:11.2139635Z *** 2025-12-04T10:34:11.2139806Z GITHUB_REPOSITORY_ID=65600975 2025-12-04T10:34:11.2140007Z GITHUB_ACTIONS=true 2025-12-04T10:34:11.2140221Z SHA1=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T10:34:11.2140495Z GITHUB_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T10:34:11.2140897Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/trunk-rocm-mi300.yml@refs/heads/main 2025-12-04T10:34:11.2141250Z UCC_HOME=/usr 2025-12-04T10:34:11.2141429Z TORCH_SERIALIZATION_DEBUG=1 2025-12-04T10:34:11.2141642Z RUNNER_ENVIRONMENT=self-hosted 2025-12-04T10:34:11.2142176Z VERBOSE_TEST_LOGS=False 2025-12-04T10:34:11.2142374Z GITHUB_REF=refs/heads/main 2025-12-04T10:34:11.2142559Z RUNNER_OS=Linux 2025-12-04T10:34:11.2142726Z SHARD_NUMBER=2 2025-12-04T10:34:11.2142898Z GITHUB_REF_PROTECTED=true 2025-12-04T10:34:11.2143090Z RUNNER_MANUALLY_TRAP_SIG=1 2025-12-04T10:34:11.2143279Z HOME=/var/lib/jenkins 2025-12-04T10:34:11.2143496Z GITHUB_API_URL=https://api.github.com 2025-12-04T10:34:11.2143740Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2025-12-04T10:34:11.2143984Z RUNNER_DOCS_DIR=/home/runner/_work/_temp/docs 2025-12-04T10:34:11.2144217Z LANG=C.UTF-8 2025-12-04T10:34:11.2144423Z UCX_COMMIT=29831d319e6be55cb8c768ca61de335c934ca39e 2025-12-04T10:34:11.2144676Z PYTORCH_TEST_WITH_ROCM=1 2025-12-04T10:34:11.2144937Z RUNNER_TRACKING_ID=github_4b208c78-f2ba-477a-8e64-14a9af1f4823 2025-12-04T10:34:11.2145206Z RUNNER_ARCH=X64 2025-12-04T10:34:11.2145385Z RUNNER_TEMP=/home/runner/_work/_temp 2025-12-04T10:34:11.2145601Z NUM_TEST_SHARDS=3 2025-12-04T10:34:11.2145775Z UCX_HOME=/usr 2025-12-04T10:34:11.2146121Z GITHUB_STATE=/home/runner/_work/_temp/_runner_file_commands/save_state_c2a8d1b0-3c11-4303-ae05-bde093f6a837 2025-12-04T10:34:11.2146694Z JOB_NAME=linux-jammy-rocm-py3.10 / test (distributed, 2, 3, linux.rocm.gpu.gfx942.4.b, mem_leak_check, unstable) 2025-12-04T10:34:11.2147088Z MAGMA_HOME=/opt/rocm/magma 2025-12-04T10:34:11.2147436Z GITHUB_ENV=/home/runner/_work/_temp/_runner_file_commands/set_env_c2a8d1b0-3c11-4303-ae05-bde093f6a837 2025-12-04T10:34:11.2147908Z GITHUB_EVENT_PATH=/home/runner/_work/_temp/_github_workflow/event.json 2025-12-04T10:34:11.2148124Z GITHUB_EVENT_NAME=schedule 2025-12-04T10:34:11.2148331Z GITHUB_ACTIONS_RUNNER_EXTRA_USER_AGENT=actions-runner-controller/0.12.1 2025-12-04T10:34:11.2148550Z DASHBOARD_TAG= 2025-12-04T10:34:11.2148685Z GITHUB_RUN_ID=19922849170 2025-12-04T10:34:11.2148961Z GITHUB_STEP_SUMMARY=/home/runner/_work/_temp/_runner_file_commands/step_summary_c2a8d1b0-3c11-4303-ae05-bde093f6a837 2025-12-04T10:34:11.2149272Z GITHUB_ACTOR=pytorchmergebot 2025-12-04T10:34:11.2149424Z PR_NUMBER= 2025-12-04T10:34:11.2149558Z GITHUB_RUN_ATTEMPT=1 2025-12-04T10:34:11.2149744Z VALGRIND=ON 2025-12-04T10:34:11.2149879Z ANACONDA_PYTHON_VERSION=3.10 2025-12-04T10:34:11.2150068Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2025-12-04T10:34:11.2150243Z TERM=vt100 2025-12-04T10:34:11.2150375Z INSTALLED_VISION=yes 2025-12-04T10:34:11.2150531Z BRANCH=main 2025-12-04T10:34:11.2150654Z 
OPENSSL_ROOT_DIR=/opt/openssl 2025-12-04T10:34:11.2150814Z TESTS_TO_INCLUDE= 2025-12-04T10:34:11.2151029Z GITHUB_ACTION_PATH=/home/runner/_work/pytorch/pytorch/./.github/actions/setup-rocm 2025-12-04T10:34:11.2151284Z GITHUB_SERVER_URL=https://github.com 2025-12-04T10:34:11.2151476Z PYTORCH_ROCM_ARCH=gfx90a;gfx942;gfx950;gfx1100 2025-12-04T10:34:11.2151687Z UCC_COMMIT=9f4b242cbbd8b1462cbc732eb29316cdfa124b77 2025-12-04T10:34:11.2151870Z REENABLED_ISSUES= 2025-12-04T10:34:11.2151995Z SHLVL=1 2025-12-04T10:34:11.2152117Z MAX_JOBS=126 2025-12-04T10:34:11.2152284Z RUNNER_TEST_RESULTS_DIR=/home/runner/_work/_temp/test-results 2025-12-04T10:34:11.2152482Z GITHUB_ACTOR_ID=97764156 2025-12-04T10:34:11.2152638Z RUNNER_TOOL_CACHE=/home/runner/_work/_tool 2025-12-04T10:34:11.2152851Z GITHUB_WORKFLOW_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T10:34:11.2153053Z GITHUB_REF_NAME=main 2025-12-04T10:34:11.2153190Z ROCM_PATH=/opt/rocm 2025-12-04T10:34:11.2153312Z GITHUB_JOB=test 2025-12-04T10:34:11.2153443Z NO_TEST_TIMEOUT=False 2025-12-04T10:34:11.2153588Z GITHUB_REPOSITORY=pytorch/pytorch 2025-12-04T10:34:11.2153755Z LC_ALL=C.UTF-8 2025-12-04T10:34:11.2153879Z GITHUB_RETENTION_DAYS=90 2025-12-04T10:34:11.2154046Z RUNNER_WORKSPACE=/home/runner/_work/pytorch 2025-12-04T10:34:11.2154225Z OPENSSL_DIR=/opt/openssl 2025-12-04T10:34:11.2154373Z GITHUB_ACTION_REPOSITORY= 2025-12-04T10:34:11.2154906Z PATH=/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-12-04T10:34:11.2155396Z GITHUB_BASE_REF= 2025-12-04T10:34:11.2155519Z CI=true 2025-12-04T10:34:11.2155649Z GITHUB_REPOSITORY_OWNER=pytorch 2025-12-04T10:34:11.2155801Z JOB_ID=57116213187 2025-12-04T10:34:11.2155925Z GITHUB_HEAD_REF= 2025-12-04T10:34:11.2156056Z GITHUB_ACTION_REF= 2025-12-04T10:34:11.2156181Z TEST_SHOWLOCALS=False 2025-12-04T10:34:11.2156329Z GITHUB_WORKFLOW=trunk-rocm-mi300 2025-12-04T10:34:11.2156487Z DEBIAN_FRONTEND=noninteractive 2025-12-04T10:34:11.2156768Z GITHUB_OUTPUT=/home/runner/_work/_temp/_runner_file_commands/set_output_c2a8d1b0-3c11-4303-ae05-bde093f6a837 2025-12-04T10:34:11.2157044Z NO_TD=False 2025-12-04T10:34:11.2157160Z OLDPWD=/var/lib/jenkins 2025-12-04T10:34:11.2157302Z _=/usr/bin/env 2025-12-04T10:34:11.2157434Z + echo 'Testing pytorch' 2025-12-04T10:34:11.2157570Z Testing pytorch 2025-12-04T10:34:11.2157755Z + export LANG=C.UTF-8 2025-12-04T10:34:11.2157883Z + LANG=C.UTF-8 2025-12-04T10:34:11.2157981Z + PR_NUMBER= 2025-12-04T10:34:11.2158095Z + [[ distributed == \d\e\f\a\u\l\t ]] 2025-12-04T10:34:11.2158237Z + [[ distributed == \d\i\s\t\r\i\b\u\t\e\d ]] 2025-12-04T10:34:11.2158386Z + [[ linux-jammy-rocm-py3.10 == *rocm* ]] 2025-12-04T10:34:11.2158531Z + export HIP_VISIBLE_DEVICES=0,1,2,3 2025-12-04T10:34:11.2158670Z + HIP_VISIBLE_DEVICES=0,1,2,3 2025-12-04T10:34:11.2158798Z + [[ distributed == \s\l\o\w ]] 2025-12-04T10:34:11.2158946Z + [[ linux-jammy-rocm-py3.10 == *slow-gradcheck* ]] 2025-12-04T10:34:11.2159105Z + [[ linux-jammy-rocm-py3.10 == *cuda* ]] 2025-12-04T10:34:11.2159321Z + [[ linux-jammy-rocm-py3.10 == *rocm* ]] 2025-12-04T10:34:11.2159472Z + export PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2025-12-04T10:34:11.2159676Z + PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2025-12-04T10:34:11.2159822Z + [[ distributed == *crossref* ]] 2025-12-04T10:34:11.2159962Z + [[ linux-jammy-rocm-py3.10 == *rocm* ]] 2025-12-04T10:34:11.2160092Z + export VALGRIND=OFF 
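Just above, the distributed ROCm branch of test.sh exports HIP_VISIBLE_DEVICES=0,1,2,3 and PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda before running the tests. On ROCm builds PyTorch exposes the GPUs through the torch.cuda API, and HIP_VISIBLE_DEVICES restricts which devices the process can enumerate. A quick illustrative sanity check along those lines (assumes a ROCm-enabled torch wheel like the one installed above; the counts are what this 4-GPU gfx942 runner should report, not output taken from this log):

# Sketch: confirm the device mask is honoured by the freshly installed wheel.
# On a 4-GPU node the first command should print 4 and the second should print 2.
HIP_VISIBLE_DEVICES=0,1,2,3 python -c "import torch; print(torch.cuda.device_count())"
HIP_VISIBLE_DEVICES=0,1     python -c "import torch; print(torch.cuda.device_count())"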
2025-12-04T10:34:11.2160207Z + VALGRIND=OFF 2025-12-04T10:34:11.2160309Z + rocminfo 2025-12-04T10:34:11.2254702Z ROCk module version 6.12.12 is loaded 2025-12-04T10:34:11.2979993Z ===================== 2025-12-04T10:34:11.2980436Z HSA System Attributes 2025-12-04T10:34:11.2980727Z ===================== 2025-12-04T10:34:11.2981011Z Runtime Version: 1.18 2025-12-04T10:34:11.2981315Z Runtime Ext Version: 1.14 2025-12-04T10:34:11.2981648Z System Timestamp Freq.: 1000.000000MHz 2025-12-04T10:34:11.2982191Z Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count) 2025-12-04T10:34:11.2982774Z Machine Model: LARGE 2025-12-04T10:34:11.2983234Z System Endianness: LITTLE 2025-12-04T10:34:11.2983629Z Mwaitx: DISABLED 2025-12-04T10:34:11.2983954Z XNACK enabled: NO 2025-12-04T10:34:11.2984261Z DMAbuf Support: YES 2025-12-04T10:34:11.2984557Z VMM Support: YES 2025-12-04T10:34:11.2984775Z 2025-12-04T10:34:11.2984881Z ========== 2025-12-04T10:34:11.2985170Z HSA Agents 2025-12-04T10:34:11.2985440Z ========== 2025-12-04T10:34:11.2985723Z ******* 2025-12-04T10:34:11.2985994Z Agent 1 2025-12-04T10:34:11.2986275Z ******* 2025-12-04T10:34:11.2986605Z Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T10:34:11.2987032Z Uuid: CPU-XX 2025-12-04T10:34:11.2987458Z Marketing Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T10:34:11.2987916Z Vendor Name: CPU 2025-12-04T10:34:11.2988359Z Feature: None specified 2025-12-04T10:34:11.2988768Z Profile: FULL_PROFILE 2025-12-04T10:34:11.2989095Z Float Round Mode: NEAR 2025-12-04T10:34:11.2989304Z Max Queue Number: 0(0x0) 2025-12-04T10:34:11.2989740Z Queue Min Size: 0(0x0) 2025-12-04T10:34:11.2989940Z Queue Max Size: 0(0x0) 2025-12-04T10:34:11.2990131Z Queue Type: MULTI 2025-12-04T10:34:11.2990318Z Node: 0 2025-12-04T10:34:11.2990527Z Device Type: CPU 2025-12-04T10:34:11.2990731Z Cache Info: 2025-12-04T10:34:11.2990889Z L1: 49152(0xc000) KB 2025-12-04T10:34:11.2991075Z Chip ID: 0(0x0) 2025-12-04T10:34:11.2991265Z ASIC Revision: 0(0x0) 2025-12-04T10:34:11.2991484Z Cacheline Size: 64(0x40) 2025-12-04T10:34:11.2991684Z Max Clock Freq. (MHz): 3300 2025-12-04T10:34:11.2991886Z BDFID: 0 2025-12-04T10:34:11.2992085Z Internal Node ID: 0 2025-12-04T10:34:11.2992291Z Compute Unit: 64 2025-12-04T10:34:11.2992493Z SIMDs per CU: 0 2025-12-04T10:34:11.2992714Z Shader Engines: 0 2025-12-04T10:34:11.2992916Z Shader Arrs. per Eng.: 0 2025-12-04T10:34:11.2993182Z WatchPts on Addr. 
Ranges:1 2025-12-04T10:34:11.2993401Z Memory Properties: 2025-12-04T10:34:11.2993544Z Features: None 2025-12-04T10:34:11.2993696Z Pool Info: 2025-12-04T10:34:11.2993835Z Pool 1 2025-12-04T10:34:11.2994008Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T10:34:11.2994208Z Size: 1584776988(0x5e75c71c) KB 2025-12-04T10:34:11.2994407Z Allocatable: TRUE 2025-12-04T10:34:11.2994609Z Alloc Granule: 4KB 2025-12-04T10:34:11.2994822Z Alloc Recommended Granule:4KB 2025-12-04T10:34:11.2995046Z Alloc Alignment: 4KB 2025-12-04T10:34:11.2995257Z Accessible by all: TRUE 2025-12-04T10:34:11.2995435Z Pool 2 2025-12-04T10:34:11.2995617Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T10:34:11.2995813Z Size: 1584776988(0x5e75c71c) KB 2025-12-04T10:34:11.2996007Z Allocatable: TRUE 2025-12-04T10:34:11.2996217Z Alloc Granule: 4KB 2025-12-04T10:34:11.2996442Z Alloc Recommended Granule:4KB 2025-12-04T10:34:11.2996655Z Alloc Alignment: 4KB 2025-12-04T10:34:11.2996861Z Accessible by all: TRUE 2025-12-04T10:34:11.2997041Z Pool 3 2025-12-04T10:34:11.2997223Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2025-12-04T10:34:11.2997410Z Size: 1584776988(0x5e75c71c) KB 2025-12-04T10:34:11.2997602Z Allocatable: TRUE 2025-12-04T10:34:11.2997818Z Alloc Granule: 4KB 2025-12-04T10:34:11.2998032Z Alloc Recommended Granule:4KB 2025-12-04T10:34:11.2998293Z Alloc Alignment: 4KB 2025-12-04T10:34:11.2998496Z Accessible by all: TRUE 2025-12-04T10:34:11.2998673Z Pool 4 2025-12-04T10:34:11.2998908Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T10:34:11.2999101Z Size: 1584776988(0x5e75c71c) KB 2025-12-04T10:34:11.2999294Z Allocatable: TRUE 2025-12-04T10:34:11.2999547Z Alloc Granule: 4KB 2025-12-04T10:34:11.2999766Z Alloc Recommended Granule:4KB 2025-12-04T10:34:11.2999941Z Alloc Alignment: 4KB 2025-12-04T10:34:11.3000113Z Accessible by all: TRUE 2025-12-04T10:34:11.3000263Z ISA Info: 2025-12-04T10:34:11.3000398Z ******* 2025-12-04T10:34:11.3000521Z Agent 2 2025-12-04T10:34:11.3000630Z ******* 2025-12-04T10:34:11.3000759Z Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T10:34:11.3000914Z Uuid: CPU-XX 2025-12-04T10:34:11.3001097Z Marketing Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T10:34:11.3001265Z Vendor Name: CPU 2025-12-04T10:34:11.3001438Z Feature: None specified 2025-12-04T10:34:11.3001600Z Profile: FULL_PROFILE 2025-12-04T10:34:11.3001772Z Float Round Mode: NEAR 2025-12-04T10:34:11.3001946Z Max Queue Number: 0(0x0) 2025-12-04T10:34:11.3002164Z Queue Min Size: 0(0x0) 2025-12-04T10:34:11.3002326Z Queue Max Size: 0(0x0) 2025-12-04T10:34:11.3002486Z Queue Type: MULTI 2025-12-04T10:34:11.3002675Z Node: 1 2025-12-04T10:34:11.3002829Z Device Type: CPU 2025-12-04T10:34:11.3003086Z Cache Info: 2025-12-04T10:34:11.3003214Z L1: 49152(0xc000) KB 2025-12-04T10:34:11.3003361Z Chip ID: 0(0x0) 2025-12-04T10:34:11.3003531Z ASIC Revision: 0(0x0) 2025-12-04T10:34:11.3003696Z Cacheline Size: 64(0x40) 2025-12-04T10:34:11.3003869Z Max Clock Freq. (MHz): 3300 2025-12-04T10:34:11.3004034Z BDFID: 0 2025-12-04T10:34:11.3004191Z Internal Node ID: 1 2025-12-04T10:34:11.3004356Z Compute Unit: 64 2025-12-04T10:34:11.3004515Z SIMDs per CU: 0 2025-12-04T10:34:11.3004674Z Shader Engines: 0 2025-12-04T10:34:11.3004840Z Shader Arrs. per Eng.: 0 2025-12-04T10:34:11.3005008Z WatchPts on Addr. 
Ranges:1 2025-12-04T10:34:11.3005158Z Memory Properties: 2025-12-04T10:34:11.3005278Z Features: None 2025-12-04T10:34:11.3005434Z Pool Info: 2025-12-04T10:34:11.3005578Z Pool 1 2025-12-04T10:34:11.3005732Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T10:34:11.3005899Z Size: 1585311804(0x5e7df03c) KB 2025-12-04T10:34:11.3006058Z Allocatable: TRUE 2025-12-04T10:34:11.3006228Z Alloc Granule: 4KB 2025-12-04T10:34:11.3006409Z Alloc Recommended Granule:4KB 2025-12-04T10:34:11.3006584Z Alloc Alignment: 4KB 2025-12-04T10:34:11.3006752Z Accessible by all: TRUE 2025-12-04T10:34:11.3006947Z Pool 2 2025-12-04T10:34:11.3007097Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T10:34:11.3007257Z Size: 1585311804(0x5e7df03c) KB 2025-12-04T10:34:11.3007452Z Allocatable: TRUE 2025-12-04T10:34:11.3007621Z Alloc Granule: 4KB 2025-12-04T10:34:11.3007814Z Alloc Recommended Granule:4KB 2025-12-04T10:34:11.3008004Z Alloc Alignment: 4KB 2025-12-04T10:34:11.3008182Z Accessible by all: TRUE 2025-12-04T10:34:11.3008355Z Pool 3 2025-12-04T10:34:11.3008489Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2025-12-04T10:34:11.3008644Z Size: 1585311804(0x5e7df03c) KB 2025-12-04T10:34:11.3008802Z Allocatable: TRUE 2025-12-04T10:34:11.3008963Z Alloc Granule: 4KB 2025-12-04T10:34:11.3009129Z Alloc Recommended Granule:4KB 2025-12-04T10:34:11.3009295Z Alloc Alignment: 4KB 2025-12-04T10:34:11.3009458Z Accessible by all: TRUE 2025-12-04T10:34:11.3009647Z Pool 4 2025-12-04T10:34:11.3009821Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T10:34:11.3009976Z Size: 1585311804(0x5e7df03c) KB 2025-12-04T10:34:11.3010130Z Allocatable: TRUE 2025-12-04T10:34:11.3010292Z Alloc Granule: 4KB 2025-12-04T10:34:11.3010482Z Alloc Recommended Granule:4KB 2025-12-04T10:34:11.3010671Z Alloc Alignment: 4KB 2025-12-04T10:34:11.3010844Z Accessible by all: TRUE 2025-12-04T10:34:11.3010991Z ISA Info: 2025-12-04T10:34:11.3011224Z ******* 2025-12-04T10:34:11.3011327Z Agent 3 2025-12-04T10:34:11.3011471Z ******* 2025-12-04T10:34:11.3011622Z Name: gfx942 2025-12-04T10:34:11.3011780Z Uuid: GPU-41f9686c3d70a95c 2025-12-04T10:34:11.3011938Z Marketing Name: 2025-12-04T10:34:11.3012145Z Vendor Name: AMD 2025-12-04T10:34:11.3012304Z Feature: KERNEL_DISPATCH 2025-12-04T10:34:11.3012486Z Profile: BASE_PROFILE 2025-12-04T10:34:11.3012668Z Float Round Mode: NEAR 2025-12-04T10:34:11.3012833Z Max Queue Number: 128(0x80) 2025-12-04T10:34:11.3013003Z Queue Min Size: 64(0x40) 2025-12-04T10:34:11.3013200Z Queue Max Size: 131072(0x20000) 2025-12-04T10:34:11.3013355Z Queue Type: MULTI 2025-12-04T10:34:11.3013505Z Node: 2 2025-12-04T10:34:11.3013660Z Device Type: GPU 2025-12-04T10:34:11.3013804Z Cache Info: 2025-12-04T10:34:11.3013928Z L1: 32(0x20) KB 2025-12-04T10:34:11.3014068Z L2: 4096(0x1000) KB 2025-12-04T10:34:11.3014208Z L3: 262144(0x40000) KB 2025-12-04T10:34:11.3014353Z Chip ID: 29861(0x74a5) 2025-12-04T10:34:11.3014549Z ASIC Revision: 1(0x1) 2025-12-04T10:34:11.3014711Z Cacheline Size: 128(0x80) 2025-12-04T10:34:11.3014908Z Max Clock Freq. (MHz): 2100 2025-12-04T10:34:11.3015085Z BDFID: 29952 2025-12-04T10:34:11.3015241Z Internal Node ID: 2 2025-12-04T10:34:11.3015399Z Compute Unit: 304 2025-12-04T10:34:11.3015568Z SIMDs per CU: 4 2025-12-04T10:34:11.3015771Z Shader Engines: 32 2025-12-04T10:34:11.3015934Z Shader Arrs. per Eng.: 1 2025-12-04T10:34:11.3016102Z WatchPts on Addr. 
Ranges:4 2025-12-04T10:34:11.3016281Z Coherent Host Access: FALSE 2025-12-04T10:34:11.3016428Z Memory Properties: 2025-12-04T10:34:11.3016559Z Features: KERNEL_DISPATCH 2025-12-04T10:34:11.3016743Z Fast F16 Operation: TRUE 2025-12-04T10:34:11.3016913Z Wavefront Size: 64(0x40) 2025-12-04T10:34:11.3017077Z Workgroup Max Size: 1024(0x400) 2025-12-04T10:34:11.3017223Z Workgroup Max Size per Dimension: 2025-12-04T10:34:11.3017352Z x 1024(0x400) 2025-12-04T10:34:11.3017521Z y 1024(0x400) 2025-12-04T10:34:11.3017648Z z 1024(0x400) 2025-12-04T10:34:11.3017790Z Max Waves Per CU: 32(0x20) 2025-12-04T10:34:11.3017947Z Max Work-item Per CU: 2048(0x800) 2025-12-04T10:34:11.3018106Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T10:34:11.3018253Z Grid Max Size per Dimension: 2025-12-04T10:34:11.3018373Z x 2147483647(0x7fffffff) 2025-12-04T10:34:11.3030383Z y 65535(0xffff) 2025-12-04T10:34:11.3030536Z z 65535(0xffff) 2025-12-04T10:34:11.3030701Z Max fbarriers/Workgrp: 32 2025-12-04T10:34:11.3030939Z Packet Processor uCode:: 185 2025-12-04T10:34:11.3031122Z SDMA engine uCode:: 24 2025-12-04T10:34:11.3031292Z IOMMU Support:: None 2025-12-04T10:34:11.3031445Z Pool Info: 2025-12-04T10:34:11.3031568Z Pool 1 2025-12-04T10:34:11.3031718Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T10:34:11.3031890Z Size: 268419072(0xfffc000) KB 2025-12-04T10:34:11.3032056Z Allocatable: TRUE 2025-12-04T10:34:11.3032230Z Alloc Granule: 4KB 2025-12-04T10:34:11.3032410Z Alloc Recommended Granule:2048KB 2025-12-04T10:34:11.3032589Z Alloc Alignment: 4KB 2025-12-04T10:34:11.3032763Z Accessible by all: FALSE 2025-12-04T10:34:11.3032916Z Pool 2 2025-12-04T10:34:11.3033066Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T10:34:11.3033229Z Size: 268419072(0xfffc000) KB 2025-12-04T10:34:11.3033388Z Allocatable: TRUE 2025-12-04T10:34:11.3033556Z Alloc Granule: 4KB 2025-12-04T10:34:11.3033729Z Alloc Recommended Granule:2048KB 2025-12-04T10:34:11.3033971Z Alloc Alignment: 4KB 2025-12-04T10:34:11.3034145Z Accessible by all: FALSE 2025-12-04T10:34:11.3034294Z Pool 3 2025-12-04T10:34:11.3034434Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T10:34:11.3034592Z Size: 268419072(0xfffc000) KB 2025-12-04T10:34:11.3034751Z Allocatable: TRUE 2025-12-04T10:34:11.3034922Z Alloc Granule: 4KB 2025-12-04T10:34:11.3035098Z Alloc Recommended Granule:2048KB 2025-12-04T10:34:11.3035271Z Alloc Alignment: 4KB 2025-12-04T10:34:11.3035441Z Accessible by all: FALSE 2025-12-04T10:34:11.3035590Z Pool 4 2025-12-04T10:34:11.3035725Z Segment: GROUP 2025-12-04T10:34:11.3035886Z Size: 64(0x40) KB 2025-12-04T10:34:11.3036043Z Allocatable: FALSE 2025-12-04T10:34:11.3036204Z Alloc Granule: 0KB 2025-12-04T10:34:11.3036379Z Alloc Recommended Granule:0KB 2025-12-04T10:34:11.3036552Z Alloc Alignment: 0KB 2025-12-04T10:34:11.3036764Z Accessible by all: FALSE 2025-12-04T10:34:11.3036915Z ISA Info: 2025-12-04T10:34:11.3037031Z ISA 1 2025-12-04T10:34:11.3037176Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T10:34:11.3037358Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T10:34:11.3037535Z Profiles: HSA_PROFILE_BASE 2025-12-04T10:34:11.3037715Z Default Rounding Mode: NEAR 2025-12-04T10:34:11.3037893Z Default Rounding Mode: NEAR 2025-12-04T10:34:11.3038060Z Fast f16: TRUE 2025-12-04T10:34:11.3038227Z Workgroup Max Size: 1024(0x400) 2025-12-04T10:34:11.3038390Z Workgroup Max Size per Dimension: 2025-12-04T10:34:11.3038537Z x 1024(0x400) 2025-12-04T10:34:11.3038690Z y 1024(0x400) 2025-12-04T10:34:11.3038832Z z 1024(0x400) 2025-12-04T10:34:11.3038986Z 
Grid Max Size: 4294967295(0xffffffff) 2025-12-04T10:34:11.3039140Z Grid Max Size per Dimension: 2025-12-04T10:34:11.3039276Z x 2147483647(0x7fffffff) 2025-12-04T10:34:11.3039428Z y 65535(0xffff) 2025-12-04T10:34:11.3039614Z z 65535(0xffff) 2025-12-04T10:34:11.3039771Z FBarrier Max Size: 32 2025-12-04T10:34:11.3039920Z ISA 2 2025-12-04T10:34:11.3040074Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T10:34:11.3040263Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T10:34:11.3040441Z Profiles: HSA_PROFILE_BASE 2025-12-04T10:34:11.3040615Z Default Rounding Mode: NEAR 2025-12-04T10:34:11.3040796Z Default Rounding Mode: NEAR 2025-12-04T10:34:11.3040962Z Fast f16: TRUE 2025-12-04T10:34:11.3041131Z Workgroup Max Size: 1024(0x400) 2025-12-04T10:34:11.3041329Z Workgroup Max Size per Dimension: 2025-12-04T10:34:11.3041469Z x 1024(0x400) 2025-12-04T10:34:11.3041609Z y 1024(0x400) 2025-12-04T10:34:11.3041748Z z 1024(0x400) 2025-12-04T10:34:11.3041901Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T10:34:11.3042052Z Grid Max Size per Dimension: 2025-12-04T10:34:11.3042187Z x 2147483647(0x7fffffff) 2025-12-04T10:34:11.3042330Z y 65535(0xffff) 2025-12-04T10:34:11.3042471Z z 65535(0xffff) 2025-12-04T10:34:11.3042628Z FBarrier Max Size: 32 2025-12-04T10:34:11.3042775Z ******* 2025-12-04T10:34:11.3042888Z Agent 4 2025-12-04T10:34:11.3042999Z ******* 2025-12-04T10:34:11.3043133Z Name: gfx942 2025-12-04T10:34:11.3043293Z Uuid: GPU-e2954cd4b2ef3669 2025-12-04T10:34:11.3043458Z Marketing Name: 2025-12-04T10:34:11.3043626Z Vendor Name: AMD 2025-12-04T10:34:11.3043793Z Feature: KERNEL_DISPATCH 2025-12-04T10:34:11.3043959Z Profile: BASE_PROFILE 2025-12-04T10:34:11.3044167Z Float Round Mode: NEAR 2025-12-04T10:34:11.3044334Z Max Queue Number: 128(0x80) 2025-12-04T10:34:11.3044498Z Queue Min Size: 64(0x40) 2025-12-04T10:34:11.3044657Z Queue Max Size: 131072(0x20000) 2025-12-04T10:34:11.3044811Z Queue Type: MULTI 2025-12-04T10:34:11.3044971Z Node: 3 2025-12-04T10:34:11.3045119Z Device Type: GPU 2025-12-04T10:34:11.3045261Z Cache Info: 2025-12-04T10:34:11.3045386Z L1: 32(0x20) KB 2025-12-04T10:34:11.3045526Z L2: 4096(0x1000) KB 2025-12-04T10:34:11.3045664Z L3: 262144(0x40000) KB 2025-12-04T10:34:11.3045811Z Chip ID: 29861(0x74a5) 2025-12-04T10:34:11.3045972Z ASIC Revision: 1(0x1) 2025-12-04T10:34:11.3046133Z Cacheline Size: 128(0x80) 2025-12-04T10:34:11.3046297Z Max Clock Freq. (MHz): 2100 2025-12-04T10:34:11.3046449Z BDFID: 1280 2025-12-04T10:34:11.3046609Z Internal Node ID: 3 2025-12-04T10:34:11.3046765Z Compute Unit: 304 2025-12-04T10:34:11.3046922Z SIMDs per CU: 4 2025-12-04T10:34:11.3047079Z Shader Engines: 32 2025-12-04T10:34:11.3047242Z Shader Arrs. per Eng.: 1 2025-12-04T10:34:11.3047408Z WatchPts on Addr. 
Ranges:4 2025-12-04T10:34:11.3047580Z Coherent Host Access: FALSE 2025-12-04T10:34:11.3047729Z Memory Properties: 2025-12-04T10:34:11.3047854Z Features: KERNEL_DISPATCH 2025-12-04T10:34:11.3048004Z Fast F16 Operation: TRUE 2025-12-04T10:34:11.3048170Z Wavefront Size: 64(0x40) 2025-12-04T10:34:11.3048335Z Workgroup Max Size: 1024(0x400) 2025-12-04T10:34:11.3048527Z Workgroup Max Size per Dimension: 2025-12-04T10:34:11.3048660Z x 1024(0x400) 2025-12-04T10:34:11.3048795Z y 1024(0x400) 2025-12-04T10:34:11.3048928Z z 1024(0x400) 2025-12-04T10:34:11.3049077Z Max Waves Per CU: 32(0x20) 2025-12-04T10:34:11.3049240Z Max Work-item Per CU: 2048(0x800) 2025-12-04T10:34:11.3049407Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T10:34:11.3049554Z Grid Max Size per Dimension: 2025-12-04T10:34:11.3049719Z x 2147483647(0x7fffffff) 2025-12-04T10:34:11.3049856Z y 65535(0xffff) 2025-12-04T10:34:11.3049992Z z 65535(0xffff) 2025-12-04T10:34:11.3050151Z Max fbarriers/Workgrp: 32 2025-12-04T10:34:11.3050321Z Packet Processor uCode:: 185 2025-12-04T10:34:11.3050482Z SDMA engine uCode:: 24 2025-12-04T10:34:11.3050636Z IOMMU Support:: None 2025-12-04T10:34:11.3050768Z Pool Info: 2025-12-04T10:34:11.3050871Z Pool 1 2025-12-04T10:34:11.3051003Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T10:34:11.3051198Z Size: 268419072(0xfffc000) KB 2025-12-04T10:34:11.3051358Z Allocatable: TRUE 2025-12-04T10:34:11.3051526Z Alloc Granule: 4KB 2025-12-04T10:34:11.3051697Z Alloc Recommended Granule:2048KB 2025-12-04T10:34:11.3051868Z Alloc Alignment: 4KB 2025-12-04T10:34:11.3052043Z Accessible by all: FALSE 2025-12-04T10:34:11.3052190Z Pool 2 2025-12-04T10:34:11.3052330Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T10:34:11.3052488Z Size: 268419072(0xfffc000) KB 2025-12-04T10:34:11.3052644Z Allocatable: TRUE 2025-12-04T10:34:11.3052810Z Alloc Granule: 4KB 2025-12-04T10:34:11.3052986Z Alloc Recommended Granule:2048KB 2025-12-04T10:34:11.3053157Z Alloc Alignment: 4KB 2025-12-04T10:34:11.3053325Z Accessible by all: FALSE 2025-12-04T10:34:11.3053469Z Pool 3 2025-12-04T10:34:11.3053606Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T10:34:11.3053764Z Size: 268419072(0xfffc000) KB 2025-12-04T10:34:11.3053919Z Allocatable: TRUE 2025-12-04T10:34:11.3054076Z Alloc Granule: 4KB 2025-12-04T10:34:11.3054239Z Alloc Recommended Granule:2048KB 2025-12-04T10:34:11.3054401Z Alloc Alignment: 4KB 2025-12-04T10:34:11.3054563Z Accessible by all: FALSE 2025-12-04T10:34:11.3054710Z Pool 4 2025-12-04T10:34:11.3054837Z Segment: GROUP 2025-12-04T10:34:11.3054982Z Size: 64(0x40) KB 2025-12-04T10:34:11.3055127Z Allocatable: FALSE 2025-12-04T10:34:11.3055285Z Alloc Granule: 0KB 2025-12-04T10:34:11.3055497Z Alloc Recommended Granule:0KB 2025-12-04T10:34:11.3055658Z Alloc Alignment: 0KB 2025-12-04T10:34:11.3055816Z Accessible by all: FALSE 2025-12-04T10:34:11.3055955Z ISA Info: 2025-12-04T10:34:11.3056060Z ISA 1 2025-12-04T10:34:11.3056193Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T10:34:11.3056357Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T10:34:11.3056525Z Profiles: HSA_PROFILE_BASE 2025-12-04T10:34:11.3056687Z Default Rounding Mode: NEAR 2025-12-04T10:34:11.3056854Z Default Rounding Mode: NEAR 2025-12-04T10:34:11.3057010Z Fast f16: TRUE 2025-12-04T10:34:11.3057163Z Workgroup Max Size: 1024(0x400) 2025-12-04T10:34:11.3057312Z Workgroup Max Size per Dimension: 2025-12-04T10:34:11.3057440Z x 1024(0x400) 2025-12-04T10:34:11.3057571Z y 1024(0x400) 2025-12-04T10:34:11.3057701Z z 1024(0x400) 2025-12-04T10:34:11.3057841Z 
Grid Max Size: 4294967295(0xffffffff) 2025-12-04T10:34:11.3057980Z Grid Max Size per Dimension: 2025-12-04T10:34:11.3058134Z x 2147483647(0x7fffffff) 2025-12-04T10:34:11.3058266Z y 65535(0xffff) 2025-12-04T10:34:11.3058395Z z 65535(0xffff) 2025-12-04T10:34:11.3058539Z FBarrier Max Size: 32 2025-12-04T10:34:11.3058677Z ISA 2 2025-12-04T10:34:11.3058823Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T10:34:11.3058998Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T10:34:11.3059161Z Profiles: HSA_PROFILE_BASE 2025-12-04T10:34:11.3059319Z Default Rounding Mode: NEAR 2025-12-04T10:34:11.3059483Z Default Rounding Mode: NEAR 2025-12-04T10:34:11.3059689Z Fast f16: TRUE 2025-12-04T10:34:11.3059847Z Workgroup Max Size: 1024(0x400) 2025-12-04T10:34:11.3059992Z Workgroup Max Size per Dimension: 2025-12-04T10:34:11.3060118Z x 1024(0x400) 2025-12-04T10:34:11.3060246Z y 1024(0x400) 2025-12-04T10:34:11.3060375Z z 1024(0x400) 2025-12-04T10:34:11.3060520Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T10:34:11.3060659Z Grid Max Size per Dimension: 2025-12-04T10:34:11.3060781Z x 2147483647(0x7fffffff) 2025-12-04T10:34:11.3060913Z y 65535(0xffff) 2025-12-04T10:34:11.3061042Z z 65535(0xffff) 2025-12-04T10:34:11.3061189Z FBarrier Max Size: 32 2025-12-04T10:34:11.3061329Z ******* 2025-12-04T10:34:11.3061428Z Agent 5 2025-12-04T10:34:11.3061527Z ******* 2025-12-04T10:34:11.3061642Z Name: gfx942 2025-12-04T10:34:11.3061789Z Uuid: GPU-d34a48edc983a6e7 2025-12-04T10:34:11.3061943Z Marketing Name: 2025-12-04T10:34:11.3062096Z Vendor Name: AMD 2025-12-04T10:34:11.3062287Z Feature: KERNEL_DISPATCH 2025-12-04T10:34:11.3062442Z Profile: BASE_PROFILE 2025-12-04T10:34:11.3062599Z Float Round Mode: NEAR 2025-12-04T10:34:11.3062756Z Max Queue Number: 128(0x80) 2025-12-04T10:34:11.3062910Z Queue Min Size: 64(0x40) 2025-12-04T10:34:11.3063059Z Queue Max Size: 131072(0x20000) 2025-12-04T10:34:11.3063208Z Queue Type: MULTI 2025-12-04T10:34:11.3063347Z Node: 4 2025-12-04T10:34:11.3063492Z Device Type: GPU 2025-12-04T10:34:11.3063626Z Cache Info: 2025-12-04T10:34:11.3063742Z L1: 32(0x20) KB 2025-12-04T10:34:11.3063877Z L2: 4096(0x1000) KB 2025-12-04T10:34:11.3064009Z L3: 262144(0x40000) KB 2025-12-04T10:34:11.3064146Z Chip ID: 29861(0x74a5) 2025-12-04T10:34:11.3064294Z ASIC Revision: 1(0x1) 2025-12-04T10:34:11.3064447Z Cacheline Size: 128(0x80) 2025-12-04T10:34:11.3064601Z Max Clock Freq. (MHz): 2100 2025-12-04T10:34:11.3064781Z BDFID: 25856 2025-12-04T10:34:11.3064929Z Internal Node ID: 4 2025-12-04T10:34:11.3065081Z Compute Unit: 304 2025-12-04T10:34:11.3065230Z SIMDs per CU: 4 2025-12-04T10:34:11.3065383Z Shader Engines: 32 2025-12-04T10:34:11.3065543Z Shader Arrs. per Eng.: 1 2025-12-04T10:34:11.3065705Z WatchPts on Addr. 
Ranges:4 2025-12-04T10:34:11.3065866Z Coherent Host Access: FALSE 2025-12-04T10:34:11.3066006Z Memory Properties: 2025-12-04T10:34:11.3066123Z Features: KERNEL_DISPATCH 2025-12-04T10:34:11.3066266Z Fast F16 Operation: TRUE 2025-12-04T10:34:11.3066422Z Wavefront Size: 64(0x40) 2025-12-04T10:34:11.3066586Z Workgroup Max Size: 1024(0x400) 2025-12-04T10:34:11.3066732Z Workgroup Max Size per Dimension: 2025-12-04T10:34:11.3066856Z x 1024(0x400) 2025-12-04T10:34:11.3066986Z y 1024(0x400) 2025-12-04T10:34:11.3067113Z z 1024(0x400) 2025-12-04T10:34:11.3067256Z Max Waves Per CU: 32(0x20) 2025-12-04T10:34:11.3067414Z Max Work-item Per CU: 2048(0x800) 2025-12-04T10:34:11.3067574Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T10:34:11.3067709Z Grid Max Size per Dimension: 2025-12-04T10:34:11.3067824Z x 2147483647(0x7fffffff) 2025-12-04T10:34:11.3067954Z y 65535(0xffff) 2025-12-04T10:34:11.3068088Z z 65535(0xffff) 2025-12-04T10:34:11.3068234Z Max fbarriers/Workgrp: 32 2025-12-04T10:34:11.3068399Z Packet Processor uCode:: 185 2025-12-04T10:34:11.3068562Z SDMA engine uCode:: 24 2025-12-04T10:34:11.3068718Z IOMMU Support:: None 2025-12-04T10:34:11.3068852Z Pool Info: 2025-12-04T10:34:11.3069046Z Pool 1 2025-12-04T10:34:11.3069182Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T10:34:11.3069335Z Size: 268419072(0xfffc000) KB 2025-12-04T10:34:11.3069486Z Allocatable: TRUE 2025-12-04T10:34:11.3069696Z Alloc Granule: 4KB 2025-12-04T10:34:11.3069861Z Alloc Recommended Granule:2048KB 2025-12-04T10:34:11.3070028Z Alloc Alignment: 4KB 2025-12-04T10:34:11.3070189Z Accessible by all: FALSE 2025-12-04T10:34:11.3070329Z Pool 2 2025-12-04T10:34:11.3070462Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T10:34:11.3070610Z Size: 268419072(0xfffc000) KB 2025-12-04T10:34:11.3070763Z Allocatable: TRUE 2025-12-04T10:34:11.3070919Z Alloc Granule: 4KB 2025-12-04T10:34:11.3071083Z Alloc Recommended Granule:2048KB 2025-12-04T10:34:11.3071246Z Alloc Alignment: 4KB 2025-12-04T10:34:11.3071401Z Accessible by all: FALSE 2025-12-04T10:34:11.3071537Z Pool 3 2025-12-04T10:34:11.3071703Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T10:34:11.3071848Z Size: 268419072(0xfffc000) KB 2025-12-04T10:34:11.3071995Z Allocatable: TRUE 2025-12-04T10:34:11.3072149Z Alloc Granule: 4KB 2025-12-04T10:34:11.3072313Z Alloc Recommended Granule:2048KB 2025-12-04T10:34:11.3072480Z Alloc Alignment: 4KB 2025-12-04T10:34:11.3072638Z Accessible by all: FALSE 2025-12-04T10:34:11.3072774Z Pool 4 2025-12-04T10:34:11.3072898Z Segment: GROUP 2025-12-04T10:34:11.3073039Z Size: 64(0x40) KB 2025-12-04T10:34:11.3073185Z Allocatable: FALSE 2025-12-04T10:34:11.3073346Z Alloc Granule: 0KB 2025-12-04T10:34:11.3073507Z Alloc Recommended Granule:0KB 2025-12-04T10:34:11.3073667Z Alloc Alignment: 0KB 2025-12-04T10:34:11.3073827Z Accessible by all: FALSE 2025-12-04T10:34:11.3073966Z ISA Info: 2025-12-04T10:34:11.3074069Z ISA 1 2025-12-04T10:34:11.3074202Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T10:34:11.3074365Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T10:34:11.3074528Z Profiles: HSA_PROFILE_BASE 2025-12-04T10:34:11.3074689Z Default Rounding Mode: NEAR 2025-12-04T10:34:11.3074852Z Default Rounding Mode: NEAR 2025-12-04T10:34:11.3075002Z Fast f16: TRUE 2025-12-04T10:34:11.3075161Z Workgroup Max Size: 1024(0x400) 2025-12-04T10:34:11.3075305Z Workgroup Max Size per Dimension: 2025-12-04T10:34:11.3075436Z x 1024(0x400) 2025-12-04T10:34:11.3075565Z y 1024(0x400) 2025-12-04T10:34:11.3075694Z z 1024(0x400) 2025-12-04T10:34:11.3075874Z 
Grid Max Size: 4294967295(0xffffffff) 2025-12-04T10:34:11.3076014Z Grid Max Size per Dimension: 2025-12-04T10:34:11.3076137Z x 2147483647(0x7fffffff) 2025-12-04T10:34:11.3076267Z y 65535(0xffff) 2025-12-04T10:34:11.3076394Z z 65535(0xffff) 2025-12-04T10:34:11.3076538Z FBarrier Max Size: 32 2025-12-04T10:34:11.3076676Z ISA 2 2025-12-04T10:34:11.3076816Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T10:34:11.3076991Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T10:34:11.3077152Z Profiles: HSA_PROFILE_BASE 2025-12-04T10:34:11.3077312Z Default Rounding Mode: NEAR 2025-12-04T10:34:11.3077480Z Default Rounding Mode: NEAR 2025-12-04T10:34:11.3077633Z Fast f16: TRUE 2025-12-04T10:34:11.3077784Z Workgroup Max Size: 1024(0x400) 2025-12-04T10:34:11.3077927Z Workgroup Max Size per Dimension: 2025-12-04T10:34:11.3078053Z x 1024(0x400) 2025-12-04T10:34:11.3078181Z y 1024(0x400) 2025-12-04T10:34:11.3078346Z z 1024(0x400) 2025-12-04T10:34:11.3078486Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T10:34:11.3078623Z Grid Max Size per Dimension: 2025-12-04T10:34:11.3078742Z x 2147483647(0x7fffffff) 2025-12-04T10:34:11.3078870Z y 65535(0xffff) 2025-12-04T10:34:11.3079002Z z 65535(0xffff) 2025-12-04T10:34:11.3079152Z FBarrier Max Size: 32 2025-12-04T10:34:11.3079292Z ******* 2025-12-04T10:34:11.3079400Z Agent 6 2025-12-04T10:34:11.3079505Z ******* 2025-12-04T10:34:11.3079669Z Name: gfx942 2025-12-04T10:34:11.3079820Z Uuid: GPU-f24a9834b47f1628 2025-12-04T10:34:11.3079980Z Marketing Name: 2025-12-04T10:34:11.3080139Z Vendor Name: AMD 2025-12-04T10:34:11.3080294Z Feature: KERNEL_DISPATCH 2025-12-04T10:34:11.3080451Z Profile: BASE_PROFILE 2025-12-04T10:34:11.3080618Z Float Round Mode: NEAR 2025-12-04T10:34:11.3080773Z Max Queue Number: 128(0x80) 2025-12-04T10:34:11.3080927Z Queue Min Size: 64(0x40) 2025-12-04T10:34:11.3081078Z Queue Max Size: 131072(0x20000) 2025-12-04T10:34:11.3081228Z Queue Type: MULTI 2025-12-04T10:34:11.3081370Z Node: 5 2025-12-04T10:34:11.3081513Z Device Type: GPU 2025-12-04T10:34:11.3081644Z Cache Info: 2025-12-04T10:34:11.3081764Z L1: 32(0x20) KB 2025-12-04T10:34:11.3081895Z L2: 4096(0x1000) KB 2025-12-04T10:34:11.3082026Z L3: 262144(0x40000) KB 2025-12-04T10:34:11.3082158Z Chip ID: 29861(0x74a5) 2025-12-04T10:34:11.3082305Z ASIC Revision: 1(0x1) 2025-12-04T10:34:11.3082496Z Cacheline Size: 128(0x80) 2025-12-04T10:34:11.3082651Z Max Clock Freq. (MHz): 2100 2025-12-04T10:34:11.3082796Z BDFID: 5376 2025-12-04T10:34:11.3082944Z Internal Node ID: 5 2025-12-04T10:34:11.3083098Z Compute Unit: 304 2025-12-04T10:34:11.3083248Z SIMDs per CU: 4 2025-12-04T10:34:11.3083399Z Shader Engines: 32 2025-12-04T10:34:11.3083557Z Shader Arrs. per Eng.: 1 2025-12-04T10:34:11.3083723Z WatchPts on Addr. 
Ranges:4 2025-12-04T10:34:11.3083887Z Coherent Host Access: FALSE 2025-12-04T10:34:11.3084025Z Memory Properties: 2025-12-04T10:34:11.3084146Z Features: KERNEL_DISPATCH 2025-12-04T10:34:11.3084298Z Fast F16 Operation: TRUE 2025-12-04T10:34:11.3084458Z Wavefront Size: 64(0x40) 2025-12-04T10:34:11.3084617Z Workgroup Max Size: 1024(0x400) 2025-12-04T10:34:11.3084767Z Workgroup Max Size per Dimension: 2025-12-04T10:34:11.3084894Z x 1024(0x400) 2025-12-04T10:34:11.3085027Z y 1024(0x400) 2025-12-04T10:34:11.3085194Z z 1024(0x400) 2025-12-04T10:34:11.3085338Z Max Waves Per CU: 32(0x20) 2025-12-04T10:34:11.3085495Z Max Work-item Per CU: 2048(0x800) 2025-12-04T10:34:11.3085653Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T10:34:11.3085794Z Grid Max Size per Dimension: 2025-12-04T10:34:11.3085921Z x 2147483647(0x7fffffff) 2025-12-04T10:34:11.3086055Z y 65535(0xffff) 2025-12-04T10:34:11.3086183Z z 65535(0xffff) 2025-12-04T10:34:11.3086340Z Max fbarriers/Workgrp: 32 2025-12-04T10:34:11.3086507Z Packet Processor uCode:: 185 2025-12-04T10:34:11.3086667Z SDMA engine uCode:: 24 2025-12-04T10:34:11.3086837Z IOMMU Support:: None 2025-12-04T10:34:11.3086976Z Pool Info: 2025-12-04T10:34:11.3087085Z Pool 1 2025-12-04T10:34:11.3087218Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T10:34:11.3087374Z Size: 268419072(0xfffc000) KB 2025-12-04T10:34:11.3087529Z Allocatable: TRUE 2025-12-04T10:34:11.3087689Z Alloc Granule: 4KB 2025-12-04T10:34:11.3087854Z Alloc Recommended Granule:2048KB 2025-12-04T10:34:11.3088018Z Alloc Alignment: 4KB 2025-12-04T10:34:11.3088180Z Accessible by all: FALSE 2025-12-04T10:34:11.3088316Z Pool 2 2025-12-04T10:34:11.3088446Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T10:34:11.3088600Z Size: 268419072(0xfffc000) KB 2025-12-04T10:34:11.3088752Z Allocatable: TRUE 2025-12-04T10:34:11.3088908Z Alloc Granule: 4KB 2025-12-04T10:34:11.3089075Z Alloc Recommended Granule:2048KB 2025-12-04T10:34:11.3089240Z Alloc Alignment: 4KB 2025-12-04T10:34:11.3089434Z Accessible by all: FALSE 2025-12-04T10:34:11.3089617Z Pool 3 2025-12-04T10:34:11.3089748Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T10:34:11.3089894Z Size: 268419072(0xfffc000) KB 2025-12-04T10:34:11.3090043Z Allocatable: TRUE 2025-12-04T10:34:11.3090198Z Alloc Granule: 4KB 2025-12-04T10:34:11.3090363Z Alloc Recommended Granule:2048KB 2025-12-04T10:34:11.3090526Z Alloc Alignment: 4KB 2025-12-04T10:34:11.3090684Z Accessible by all: FALSE 2025-12-04T10:34:11.3090819Z Pool 4 2025-12-04T10:34:11.3090942Z Segment: GROUP 2025-12-04T10:34:11.3091083Z Size: 64(0x40) KB 2025-12-04T10:34:11.3091235Z Allocatable: FALSE 2025-12-04T10:34:11.3091391Z Alloc Granule: 0KB 2025-12-04T10:34:11.3091554Z Alloc Recommended Granule:0KB 2025-12-04T10:34:11.3091717Z Alloc Alignment: 0KB 2025-12-04T10:34:11.3091876Z Accessible by all: FALSE 2025-12-04T10:34:11.3092058Z ISA Info: 2025-12-04T10:34:11.3092166Z ISA 1 2025-12-04T10:34:11.3092300Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T10:34:11.3092469Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T10:34:11.3092630Z Profiles: HSA_PROFILE_BASE 2025-12-04T10:34:11.3092798Z Default Rounding Mode: NEAR 2025-12-04T10:34:11.3092972Z Default Rounding Mode: NEAR 2025-12-04T10:34:11.3093128Z Fast f16: TRUE 2025-12-04T10:34:11.3093282Z Workgroup Max Size: 1024(0x400) 2025-12-04T10:34:11.3093428Z Workgroup Max Size per Dimension: 2025-12-04T10:34:11.3093557Z x 1024(0x400) 2025-12-04T10:34:11.3093693Z y 1024(0x400) 2025-12-04T10:34:11.3093829Z z 1024(0x400) 2025-12-04T10:34:11.3093971Z 
Grid Max Size: 4294967295(0xffffffff) 2025-12-04T10:34:11.3094108Z Grid Max Size per Dimension: 2025-12-04T10:34:11.3094228Z x 2147483647(0x7fffffff) 2025-12-04T10:34:11.3094357Z y 65535(0xffff) 2025-12-04T10:34:11.3094487Z z 65535(0xffff) 2025-12-04T10:34:11.3094631Z FBarrier Max Size: 32 2025-12-04T10:34:11.3094770Z ISA 2 2025-12-04T10:34:11.3094909Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T10:34:11.3095089Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T10:34:11.3095251Z Profiles: HSA_PROFILE_BASE 2025-12-04T10:34:11.3095418Z Default Rounding Mode: NEAR 2025-12-04T10:34:11.3095584Z Default Rounding Mode: NEAR 2025-12-04T10:34:11.3095738Z Fast f16: TRUE 2025-12-04T10:34:11.3095893Z Workgroup Max Size: 1024(0x400) 2025-12-04T10:34:11.3096038Z Workgroup Max Size per Dimension: 2025-12-04T10:34:11.3096206Z x 1024(0x400) 2025-12-04T10:34:11.3096337Z y 1024(0x400) 2025-12-04T10:34:11.3096466Z z 1024(0x400) 2025-12-04T10:34:11.3096609Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T10:34:11.3096751Z Grid Max Size per Dimension: 2025-12-04T10:34:11.3096874Z x 2147483647(0x7fffffff) 2025-12-04T10:34:11.3097006Z y 65535(0xffff) 2025-12-04T10:34:11.3097134Z z 65535(0xffff) 2025-12-04T10:34:11.3097279Z FBarrier Max Size: 32 2025-12-04T10:34:11.3097415Z *** Done *** 2025-12-04T10:34:11.3097524Z + rocminfo 2025-12-04T10:34:11.3097620Z + grep -E 'Name:.*\sgfx|Marketing' 2025-12-04T10:34:11.3949624Z Marketing Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T10:34:11.3949828Z Marketing Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T10:34:11.3950008Z Name: gfx942 2025-12-04T10:34:11.3950160Z Marketing Name: 2025-12-04T10:34:11.3950307Z Name: gfx942 2025-12-04T10:34:11.3950451Z Marketing Name: 2025-12-04T10:34:11.3950596Z Name: gfx942 2025-12-04T10:34:11.3950842Z Marketing Name: 2025-12-04T10:34:11.3950988Z Name: gfx942 2025-12-04T10:34:11.3951133Z Marketing Name: 2025-12-04T10:34:11.4036433Z + MAYBE_ROCM=rocm/ 2025-12-04T10:34:11.4036633Z + [[ linux-jammy-rocm-py3.10 == *xpu* ]] 2025-12-04T10:34:11.4036795Z + [[ linux-jammy-rocm-py3.10 != *-bazel-* ]] 2025-12-04T10:34:11.4036940Z + pip_install ninja==1.10.2 2025-12-04T10:34:11.4037095Z + pip_install_pkg='python3 -m pip install --progress-bar off' 2025-12-04T10:34:11.4037277Z + python3 -m pip install --progress-bar off ninja==1.10.2 2025-12-04T10:34:11.5962602Z Collecting ninja==1.10.2 2025-12-04T10:34:11.6221728Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl.metadata (5.0 kB) 2025-12-04T10:34:11.6311129Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (108 kB) 2025-12-04T10:34:11.7987479Z Installing collected packages: ninja 2025-12-04T10:34:11.7987915Z Attempting uninstall: ninja 2025-12-04T10:34:11.7994457Z Found existing installation: ninja 1.11.1.4 2025-12-04T10:34:11.8010674Z Uninstalling ninja-1.11.1.4: 2025-12-04T10:34:11.8049310Z Successfully uninstalled ninja-1.11.1.4 2025-12-04T10:34:11.8165221Z Successfully installed ninja-1.10.2 2025-12-04T10:34:11.8605629Z + export PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-12-04T10:34:11.8607316Z + 
PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.10/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-12-04T10:34:11.8608192Z + [[ linux-jammy-rocm-py3.10 == *aarch64* ]] 2025-12-04T10:34:11.8608485Z + [[ linux-jammy-rocm-py3.10 == *asan* ]] 2025-12-04T10:34:11.8608773Z + [[ linux-jammy-rocm-py3.10 == *-debug* ]] 2025-12-04T10:34:11.8609057Z + [[ linux-jammy-rocm-py3.10 != *-bazel-* ]] 2025-12-04T10:34:11.8609458Z + echo 'We are not in debug mode: linux-jammy-rocm-py3.10. Expect the assertion to pass' 2025-12-04T10:34:11.8610040Z We are not in debug mode: linux-jammy-rocm-py3.10. Expect the assertion to pass 2025-12-04T10:34:11.8612988Z + cd test 2025-12-04T10:34:11.8613328Z + python -c 'import torch; torch._C._crash_if_debug_asserts_fail(424242)' 2025-12-04T10:34:12.7334727Z + [[ distributed == \n\o\g\p\u\_\N\O\_\A\V\X\2 ]] 2025-12-04T10:34:12.7335048Z + [[ distributed == \n\o\g\p\u\_\A\V\X\5\1\2 ]] 2025-12-04T10:34:12.7335334Z + [[ distributed == \l\e\g\a\c\y\_\n\v\i\d\i\a\_\d\r\i\v\e\r ]] 2025-12-04T10:34:12.7340232Z + DYNAMO_BENCHMARK_FLAGS=() 2025-12-04T10:34:12.7340786Z + [[ distributed == *pr_time_benchmarks* ]] 2025-12-04T10:34:12.7341053Z + [[ distributed == *dynamo_eager* ]] 2025-12-04T10:34:12.7341301Z + [[ distributed == *aot_eager* ]] 2025-12-04T10:34:12.7341522Z + [[ distributed == *aot_inductor* ]] 2025-12-04T10:34:12.7341783Z + [[ distributed == *max_autotune_inductor* ]] 2025-12-04T10:34:12.7342016Z + [[ distributed == *inductor* ]] 2025-12-04T10:34:12.7342231Z + [[ distributed == *dynamic* ]] 2025-12-04T10:34:12.7342439Z + [[ distributed == *cpu* ]] 2025-12-04T10:34:12.7342641Z + [[ distributed == *xpu* ]] 2025-12-04T10:34:12.7342885Z + DYNAMO_BENCHMARK_FLAGS+=(--device cuda) 2025-12-04T10:34:12.7359283Z + [[ linux-jammy-rocm-py3.10 == *libtorch* ]] 2025-12-04T10:34:12.7359502Z + [[ linux-jammy-rocm-py3.10 == *-bazel-* ]] 2025-12-04T10:34:12.7366308Z + cd test 2025-12-04T10:34:12.7366510Z + python -c 'import torch; print(torch.__config__.show())' 2025-12-04T10:34:13.4577963Z PyTorch built with: 2025-12-04T10:34:13.4578141Z - GCC 11.4 2025-12-04T10:34:13.4578249Z - C++ Version: 201703 2025-12-04T10:34:13.4579006Z - Intel(R) oneAPI Math Kernel Library Version 2024.2-Product Build 20240605 for Intel(R) 64 architecture applications 2025-12-04T10:34:13.4579281Z - Intel(R) MKL-DNN v3.7.1 (Git Hash 8d263e693366ef8db40acc569cc7d8edf644556d) 2025-12-04T10:34:13.4579453Z - OpenMP 201511 (a.k.a. 
OpenMP 4.5) 2025-12-04T10:34:13.4579731Z - LAPACK is enabled (usually provided by MKL) 2025-12-04T10:34:13.4579866Z - NNPACK is enabled 2025-12-04T10:34:13.4579982Z - CPU capability usage: AVX512 2025-12-04T10:34:13.4580113Z - HIP Runtime 7.1.25424 2025-12-04T10:34:13.4580222Z - MIOpen 3.5.1 2025-12-04T10:34:13.4580317Z - Magma 2.9.0 2025-12-04T10:34:13.4581944Z - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, COMMIT_SHA=35b7a9a26c5923d98aebaa41a031dae21788a9ee, CXX_COMPILER=/opt/cache/bin/c++, CXX_FLAGS= -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOXPUPTI=ON -DUSE_FBGEMM -DUSE_FBGEMM_GENAI -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -DC10_NODEPRECATED -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-unknown-pragmas -Wno-unused-parameter -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=old-style-cast -faligned-new -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, TORCH_VERSION=2.10.0, USE_CUDA=OFF, USE_CUDNN=OFF, USE_CUSPARSELT=OFF, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=ON, USE_ROCM_KERNEL_ASSERT=OFF, USE_XCCL=OFF, USE_XPU=OFF, 2025-12-04T10:34:13.4583603Z 2025-12-04T10:34:13.6670003Z + cd test 2025-12-04T10:34:13.6670525Z + python -c 'import torch; print(torch.__config__.parallel_info())' 2025-12-04T10:34:14.3186471Z ATen/Parallel: 2025-12-04T10:34:14.3186944Z at::get_num_threads() : 128 2025-12-04T10:34:14.3187308Z at::get_num_interop_threads() : 128 2025-12-04T10:34:14.3187700Z OpenMP 201511 (a.k.a. 
OpenMP 4.5) 2025-12-04T10:34:14.3188029Z omp_get_max_threads() : 128 2025-12-04T10:34:14.3188636Z Intel(R) oneAPI Math Kernel Library Version 2024.2-Product Build 20240605 for Intel(R) 64 architecture applications 2025-12-04T10:34:14.3189239Z mkl_get_max_threads() : 128 2025-12-04T10:34:14.3189855Z Intel(R) MKL-DNN v3.7.1 (Git Hash 8d263e693366ef8db40acc569cc7d8edf644556d) 2025-12-04T10:34:14.3191030Z std::thread::hardware_concurrency() : 128 2025-12-04T10:34:14.3191371Z Environment variables: 2025-12-04T10:34:14.3191661Z OMP_NUM_THREADS : [not set] 2025-12-04T10:34:14.3191951Z MKL_NUM_THREADS : [not set] 2025-12-04T10:34:14.3192254Z ATen parallel backend: OpenMP 2025-12-04T10:34:14.3192457Z 2025-12-04T10:34:14.5464366Z + [[ distributed == *numpy_2* ]] 2025-12-04T10:34:14.5464637Z + [[ linux-jammy-rocm-py3.10 == *aarch64* ]] 2025-12-04T10:34:14.5464847Z + [[ distributed == *backward* ]] 2025-12-04T10:34:14.5465085Z + [[ distributed == *libtorch_agnostic_targetting* ]] 2025-12-04T10:34:14.5465299Z + [[ distributed == *xla* ]] 2025-12-04T10:34:14.5465469Z + [[ distributed == *vllm* ]] 2025-12-04T10:34:14.5465633Z + [[ distributed == *executorch* ]] 2025-12-04T10:34:14.5465818Z + [[ distributed == \j\i\t\_\l\e\g\a\c\y ]] 2025-12-04T10:34:14.5466013Z + [[ distributed == \q\u\a\n\t\i\z\a\t\i\o\n ]] 2025-12-04T10:34:14.5466210Z + [[ linux-jammy-rocm-py3.10 == *libtorch* ]] 2025-12-04T10:34:14.5466412Z + [[ distributed == distributed ]] 2025-12-04T10:34:14.5466583Z + test_distributed 2025-12-04T10:34:14.5466752Z + echo 'Testing distributed python tests' 2025-12-04T10:34:14.5466943Z Testing distributed python tests 2025-12-04T10:34:14.5467185Z + python test/run_test.py --distributed-tests --shard 2 3 --verbose 2025-12-04T10:34:16.2378123Z Excluding distributed/rpc/test_faulty_agent on ROCm 2025-12-04T10:34:16.2378683Z Excluding distributed/rpc/test_tensorpipe_agent on ROCm 2025-12-04T10:34:16.2379788Z Excluding distributed/rpc/test_share_memory on ROCm 2025-12-04T10:34:16.2380273Z Excluding distributed/rpc/cuda/test_tensorpipe_agent on ROCm 2025-12-04T10:34:17.2573698Z Downloading https://ossci-metrics.s3.amazonaws.com/disabled-tests-condensed.json to /var/lib/jenkins/pytorch/test/.pytorch-disabled-tests.json 2025-12-04T10:34:17.6176325Z Ignoring disabled issues: [''] 2025-12-04T10:34:17.6224093Z Found test times from artifacts 2025-12-04T10:34:17.6385735Z Found test times from artifacts 2025-12-04T10:34:17.6390576Z Running all tests 2025-12-04T10:34:17.6460001Z Running parallel tests on 1 processes 2025-12-04T10:34:17.6461770Z Name: tests to run (est. 
time: 161.22min) 2025-12-04T10:34:17.6462154Z Serial tests (74): 2025-12-04T10:34:17.6462420Z distributed/test_inductor_collectives 2/2 2025-12-04T10:34:17.6462743Z distributed/_tools/test_fake_collectives 1/1 2025-12-04T10:34:17.6463056Z distributed/test_control_collectives 1/1 2025-12-04T10:34:17.6463384Z distributed/test_collective_utils 1/1 2025-12-04T10:34:17.6463692Z distributed/test_c10d_object_collectives 1/1 2025-12-04T10:34:17.6463975Z distributed/algorithms/test_join 1/1 2025-12-04T10:34:17.6464267Z distributed/tensor/test_dtensor_compile 2/4 2025-12-04T10:34:17.6464594Z distributed/pipelining/test_schedule_multiproc 1/1 2025-12-04T10:34:17.6464913Z distributed/pipelining/test_pipe 1/1 2025-12-04T10:34:17.6465199Z distributed/test_compute_comm_reordering 1/1 2025-12-04T10:34:17.6465495Z distributed/tensor/test_dtensor 3/3 2025-12-04T10:34:17.6465799Z distributed/test_aten_comm_compute_reordering 3/3 2025-12-04T10:34:17.6466111Z distributed/tensor/test_redistribute 2/2 2025-12-04T10:34:17.6466395Z distributed/tensor/test_tensor_ops 3/4 2025-12-04T10:34:17.6466667Z distributed/test_device_mesh 1/2 2025-12-04T10:34:17.6466945Z distributed/tensor/test_convolution_ops 1/1 2025-12-04T10:34:17.6467253Z distributed/tensor/parallel/test_tp_style 1/1 2025-12-04T10:34:17.6467546Z distributed/test_debug 1/1 2025-12-04T10:34:17.6467810Z distributed/test_overlap_bucketing_unit 1/1 2025-12-04T10:34:17.6468173Z distributed/checkpoint/_experimental/test_checkpoint_writer 1/1 2025-12-04T10:34:17.6468545Z distributed/optim/test_named_optimizer 1/1 2025-12-04T10:34:17.6468888Z distributed/checkpoint/_experimental/test_checkpointer 1/1 2025-12-04T10:34:17.6469225Z distributed/tensor/test_api 1/1 2025-12-04T10:34:17.6469485Z distributed/tensor/test_init 1/1 2025-12-04T10:34:17.6470418Z distributed/checkpoint/e2e/test_fine_tuning 1/1 2025-12-04T10:34:17.6470724Z distributed/tensor/test_matrix_ops 1/1 2025-12-04T10:34:17.6471003Z distributed/pipelining/test_stage 1/1 2025-12-04T10:34:17.6471324Z distributed/tensor/parallel/test_tp_random_state 1/1 2025-12-04T10:34:17.6471639Z distributed/checkpoint/test_planner 1/1 2025-12-04T10:34:17.6471949Z distributed/checkpoint/test_dtensor_checkpoint 1/1 2025-12-04T10:34:17.6472265Z distributed/pipelining/test_schedule 1/1 2025-12-04T10:34:17.6472604Z distributed/_composable/fsdp/test_fully_shard_overlap 1/1 2025-12-04T10:34:17.6472921Z distributed/test_run 1/1 2025-12-04T10:34:17.6473167Z distributed/tensor/test_math_ops 1/1 2025-12-04T10:34:17.6473449Z distributed/tensor/test_pointwise_ops 1/1 2025-12-04T10:34:17.6473746Z distributed/checkpoint/test_compatibility 1/1 2025-12-04T10:34:17.6473961Z distributed/_tools/test_mem_tracker 1/1 2025-12-04T10:34:17.6474174Z distributed/elastic/test_control_plane 1/1 2025-12-04T10:34:17.6474378Z distributed/fsdp/test_fsdp_overlap 1/1 2025-12-04T10:34:17.6474576Z distributed/test_functional_api 1/1 2025-12-04T10:34:17.6474840Z distributed/_composable/test_composability/test_2d_composability 1/1 2025-12-04T10:34:17.6475109Z distributed/fsdp/test_fsdp_optim_state 1/1 2025-12-04T10:34:17.6475310Z distributed/tensor/test_view_ops 1/1 2025-12-04T10:34:17.6475514Z distributed/fsdp/test_fsdp_state_dict 2/2 2025-12-04T10:34:17.6475863Z distributed/fsdp/test_fsdp_exec_order 1/1 2025-12-04T10:34:17.6476067Z distributed/test_distributed_spawn 2/7 2025-12-04T10:34:17.6476272Z distributed/test_distributed_spawn 5/7 2025-12-04T10:34:17.6476471Z distributed/fsdp/test_fsdp_input 1/1 2025-12-04T10:34:17.6476672Z 
distributed/fsdp/test_fsdp_traversal 1/1 2025-12-04T10:34:17.6476890Z distributed/fsdp/test_fsdp_ignored_modules 1/1 2025-12-04T10:34:17.6477107Z distributed/fsdp/test_checkpoint_wrapper 1/1 2025-12-04T10:34:17.6477323Z distributed/fsdp/test_fsdp_checkpoint 1/1 2025-12-04T10:34:17.6477528Z distributed/fsdp/test_fsdp_fine_tune 1/1 2025-12-04T10:34:17.6477729Z distributed/test_multi_threaded_pg 1/1 2025-12-04T10:34:17.6477974Z distributed/_composable/fsdp/test_fully_shard_extensions 1/1 2025-12-04T10:34:17.6478255Z distributed/checkpoint/test_file_system_checkpoint_cpu 1/1 2025-12-04T10:34:17.6478496Z distributed/fsdp/test_wrap 1/1 2025-12-04T10:34:17.6478711Z distributed/fsdp/test_hsdp_dtensor_state_dict 1/1 2025-12-04T10:34:17.6478944Z distributed/fsdp/test_fsdp_hybrid_shard 1/1 2025-12-04T10:34:17.6479187Z distributed/_composable/fsdp/test_fully_shard_training 1/1 2025-12-04T10:34:17.6479439Z distributed/fsdp/test_fsdp_multiple_forward 1/1 2025-12-04T10:34:17.6479717Z distributed/checkpoint/test_state_dict 1/1 2025-12-04T10:34:17.6479920Z distributed/fsdp/test_fsdp_core 1/2 2025-12-04T10:34:17.6480108Z distributed/test_c10d_spawn_ucc 1/1 2025-12-04T10:34:17.6480296Z distributed/test_c10d_gloo 1/1 2025-12-04T10:34:17.6480487Z distributed/test_c10d_ops_nccl 1/1 2025-12-04T10:34:17.6480686Z distributed/elastic/events/lib_test 1/1 2025-12-04T10:34:17.6480889Z distributed/elastic/metrics/api_test 1/1 2025-12-04T10:34:17.6481110Z distributed/elastic/multiprocessing/api_test 1/1 2025-12-04T10:34:17.6481355Z distributed/elastic/timer/local_timer_example 1/1 2025-12-04T10:34:17.6481584Z distributed/elastic/timer/local_timer_test 1/1 2025-12-04T10:34:17.6481815Z distributed/elastic/utils/distributed_test 1/1 2025-12-04T10:34:17.6482028Z distributed/elastic/utils/logging_test 1/1 2025-12-04T10:34:17.6482233Z distributed/elastic/utils/util_test 1/1 2025-12-04T10:34:17.6482430Z Parallel tests (0): 2025-12-04T10:34:17.6482591Z Name: excluded (est. time: 0.0min) 2025-12-04T10:34:17.6482763Z Serial tests (0): 2025-12-04T10:34:17.6482941Z Parallel tests (0): 2025-12-04T10:34:17.6483299Z Running distributed/test_inductor_collectives 2/2 ... [2025-12-04 10:34:17.646370][5222498.625407181] 2025-12-04T10:34:17.6483623Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:34:17.6484131Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_inductor_collectives.py', '--shard-id=2', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 10:34:17.646615] 2025-12-04T10:35:47.7124069Z 2025-12-04T10:35:47.7127743Z distributed/test_inductor_collectives 2/2 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_inductor_collectives_2.2_ff03bed5fd29f50f_.log 2025-12-04T10:35:47.7137589Z Running 28 items in this shard: test/distributed/test_inductor_collectives.py::TestCollectivesMultiProc::test_all_to_all_recompute_is_always_banned_override_with_ac_False, test/distributed/test_inductor_collectives.py::TestCollectivesMultiProc::test_all_to_all_single_inductor_split_sizes_none, test/distributed/test_inductor_collectives.py::TestCollectivesMultiProc::test_eager_allreduce_inductor_wait, test/distributed/test_inductor_collectives.py::TestCollectivesMultiProc::test_eager_async_allreduce_inductor_wait, test/distributed/test_inductor_collectives.py::TestCollectivesMultiProc::test_inductor_allreduce_eager_wait, test/distributed/test_inductor_collectives.py::TestCollectivesMultiProc::test_reduce_scatter_tensor_inductor, test/distributed/test_inductor_collectives.py::TestCollectivesInductor::test_all_gather_bucket_bucket_mode_all, test/distributed/test_inductor_collectives.py::TestCollectivesInductor::test_all_gather_bucket_bucket_mode_all_custom_ops, test/distributed/test_inductor_collectives.py::TestCollectivesInductor::test_all_gather_bucket_multidtype_bucket_mode_all_custom_ops_multidtype, test/distributed/test_inductor_collectives.py::TestCollectivesInductor::test_all_gather_bucket_path, test/distributed/test_inductor_collectives.py::TestCollectivesInductor::test_all_reduce_bucket_bucket_mode_all, test/distributed/test_inductor_collectives.py::TestCollectivesInductor::test_dynamo_get_world_group_source_GroupMember_WORLD, test/distributed/test_inductor_collectives.py::TestCollectivesInductor::test_dynamo_pg_var, test/distributed/test_inductor_collectives.py::TestCollectivesInductor::test_dynamo_rewrite_dist_all_gather_list, test/distributed/test_inductor_collectives.py::TestCollectivesInductor::test_dynamo_rewrite_dist_allreduce_pg_mode_kwargs_none, test/distributed/test_inductor_collectives.py::TestCollectivesInductor::test_dynamo_rewrite_dist_allreduce_pg_mode_positional_none, test/distributed/test_inductor_collectives.py::TestCollectivesInductor::test_dynamo_rewrite_dist_allreduce_reduce_op_reduce_op0, test/distributed/test_inductor_collectives.py::TestCollectivesInductor::test_dynamo_rewrite_dist_allreduce_reduce_op_reduce_op3, test/distributed/test_inductor_collectives.py::TestCollectivesInductor::test_dynamo_support_collective_op_with_async_op_False, test/distributed/test_inductor_collectives.py::TestCollectivesInductor::test_dynamo_trace_all_gather_tensor, test/distributed/test_inductor_collectives.py::TestCollectivesInductor::test_inductor_all_gather_coalesced, test/distributed/test_inductor_collectives.py::TestCollectivesInductor::test_inductor_doesnt_mutate_shared, test/distributed/test_inductor_collectives.py::TestCollectivesInductor::test_inductor_doesnt_mutate_shared_graph_partition, test/distributed/test_inductor_collectives.py::TestCollectivesInductor::test_inductor_single_op, test/distributed/test_inductor_collectives.py::TestCollectivesInductor::test_inductor_steal_buffer, test/distributed/test_inductor_collectives.py::TestCollectivesInductor::test_reduce_scatter_bucket_bucket_mode_all_custom_ops, test/distributed/test_inductor_collectives.py::TestCollectivesInductor::test_reorder_respects_wait_dep, 
test/distributed/test_inductor_collectives.py::TestSyncDecisionCrossRanks::test_all_reduce_comm_analysis 2025-12-04T10:35:47.7145298Z 2025-12-04T10:35:47.7145650Z Finished distributed/test_inductor_collectives 2/2 ... [2025-12-04 10:35:47.712386][5222588.691423506], took 1.50min 2025-12-04T10:35:47.7146309Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:35:48.9545659Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:35:48.9546292Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T10:35:48.9546777Z Uploading artifacts took 0.00 seconds 2025-12-04T10:35:48.9547294Z Running distributed/_tools/test_fake_collectives 1/1 ... [2025-12-04 10:35:48.954392][5222589.933425658] 2025-12-04T10:35:48.9547812Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:35:48.9548866Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_tools/test_fake_collectives.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:35:48.954659] 2025-12-04T10:35:51.2729632Z 2025-12-04T10:35:51.2730842Z distributed/_tools/test_fake_collectives 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._tools.test_fake_collectives_1.1_0fa9d9bee7702c92_.log 2025-12-04T10:35:51.2732154Z Running 1 items in this shard: test/distributed/_tools/test_fake_collectives.py::TestFakeCollectives::test_collectives 2025-12-04T10:35:51.2733198Z 2025-12-04T10:35:51.2733582Z Finished distributed/_tools/test_fake_collectives 1/1 ... [2025-12-04 10:35:51.272630][5222592.251666068], took 0.04min 2025-12-04T10:35:51.2734953Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:35:51.2750016Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:35:51.2752947Z Running distributed/test_control_collectives 1/1 ... [2025-12-04 10:35:51.275144][5222592.254185345] 2025-12-04T10:35:51.2753362Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:35:51.2754941Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_control_collectives.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 10:35:51.275346] 2025-12-04T10:35:53.4930807Z 2025-12-04T10:35:53.4931280Z distributed/test_control_collectives 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_control_collectives_1.1_39b5fa5d5139d686_.log 2025-12-04T10:35:53.4933763Z Running 13 items in this shard: test/distributed/test_control_collectives.py::TestCollectives::test_all_gather_timeout, test/distributed/test_control_collectives.py::TestCollectives::test_all_sum, test/distributed/test_control_collectives.py::TestCollectives::test_all_sum_timeout, test/distributed/test_control_collectives.py::TestCollectives::test_barrier, test/distributed/test_control_collectives.py::TestCollectives::test_barrier_timeout, test/distributed/test_control_collectives.py::TestCollectives::test_broadcast, test/distributed/test_control_collectives.py::TestCollectives::test_broadcast_timeout, test/distributed/test_control_collectives.py::TestCollectives::test_gather, test/distributed/test_control_collectives.py::TestCollectives::test_gather_timeout, test/distributed/test_control_collectives.py::TestCollectives::test_scatter, test/distributed/test_control_collectives.py::TestCollectives::test_scatter_timeout, test/distributed/test_control_collectives.py::TestCollectives::test_simple_user_func, test/distributed/test_control_collectives.py::TestCollectives::test_unique 2025-12-04T10:35:53.4935980Z 2025-12-04T10:35:53.4936192Z Finished distributed/test_control_collectives 1/1 ... [2025-12-04 10:35:53.492735][5222594.471772094], took 0.04min 2025-12-04T10:35:53.4937030Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:35:53.4951173Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:35:53.4951961Z Running distributed/test_collective_utils 1/1 ... [2025-12-04 10:35:53.495075][5222594.474116213] 2025-12-04T10:35:53.4952202Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:35:53.4953952Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_collective_utils.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 10:35:53.495271] 2025-12-04T10:36:13.4893451Z 2025-12-04T10:36:13.4894747Z distributed/test_collective_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_collective_utils_1.1_fa7a4a2b2eb0275c_.log 2025-12-04T10:36:13.4898242Z Running 9 items in this shard: test/distributed/test_collective_utils.py::TestCollectiveUtils::test_all_gather_result, test/distributed/test_collective_utils.py::TestCollectiveUtils::test_all_gather_result_no_pg, test/distributed/test_collective_utils.py::TestCollectiveUtils::test_all_gather_result_raises_exceptions_from_func, test/distributed/test_collective_utils.py::TestCollectiveUtils::test_broadcast_result, test/distributed/test_collective_utils.py::TestCollectiveUtils::test_broadcast_result_no_pg, test/distributed/test_collective_utils.py::TestCollectiveUtils::test_broadcast_result_raises_exceptions_from_func, test/distributed/test_collective_utils.py::TestCollectiveUtils::test_check_rng_sync_device_cpu, test/distributed/test_collective_utils.py::TestCollectiveUtils::test_check_rng_sync_device_cuda, test/distributed/test_collective_utils.py::TestUtils::test_summarize_ranks 2025-12-04T10:36:13.4902009Z 2025-12-04T10:36:13.4902335Z Finished distributed/test_collective_utils 1/1 ... [2025-12-04 10:36:13.489094][5222614.468131509], took 0.33min 2025-12-04T10:36:13.4903405Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:36:13.4911673Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:36:13.4914019Z Running distributed/test_c10d_object_collectives 1/1 ... [2025-12-04 10:36:13.491314][5222614.47035563] 2025-12-04T10:36:13.4914394Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:36:13.4916437Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_c10d_object_collectives.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:36:13.491513] 2025-12-04T10:36:56.0218224Z 2025-12-04T10:36:56.0219827Z distributed/test_c10d_object_collectives 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_c10d_object_collectives_1.1_81d285d60c1553a0_.log 2025-12-04T10:36:56.0224318Z Running 9 items in this shard: test/distributed/test_c10d_object_collectives.py::TestObjectCollectives::test_all_gather_object, test/distributed/test_c10d_object_collectives.py::TestObjectCollectives::test_broadcast_object_list, test/distributed/test_c10d_object_collectives.py::TestObjectCollectives::test_gather_object, test/distributed/test_c10d_object_collectives.py::TestObjectCollectives::test_scatter_object_list, test/distributed/test_c10d_object_collectives.py::TestObjectCollectives::test_send_recv_object_list, test/distributed/test_c10d_object_collectives.py::TestObjectCollectives::test_subpg_all_gather_object, test/distributed/test_c10d_object_collectives.py::TestObjectCollectives::test_subpg_broadcast_object, test/distributed/test_c10d_object_collectives.py::TestObjectCollectives::test_subpg_gather_object, test/distributed/test_c10d_object_collectives.py::TestObjectCollectives::test_subpg_scatter_object 2025-12-04T10:36:56.0226872Z 2025-12-04T10:36:56.0227154Z Finished distributed/test_c10d_object_collectives 1/1 ... 
[2025-12-04 10:36:56.021469][5222657.000506365], took 0.71min 2025-12-04T10:36:56.0228041Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:36:56.0236532Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:36:56.0239078Z Running distributed/algorithms/test_join 1/1 ... [2025-12-04 10:36:56.023814][5222657.002854594] 2025-12-04T10:36:56.0239377Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:36:56.0241472Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/algorithms/test_join.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:36:56.024022] 2025-12-04T10:37:39.4578594Z 2025-12-04T10:37:39.4582750Z distributed/algorithms/test_join 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.algorithms.test_join_1.1_964bfaa3d83ab92b_.log 2025-12-04T10:37:39.4586478Z Running 9 items in this shard: test/distributed/algorithms/test_join.py::TestJoin::test_join_kwargs, test/distributed/algorithms/test_join.py::TestJoin::test_multiple_joinable_disable, test/distributed/algorithms/test_join.py::TestJoin::test_multiple_joinables, test/distributed/algorithms/test_join.py::TestJoin::test_multiple_joinables_throw, test/distributed/algorithms/test_join.py::TestJoin::test_single_joinable, test/distributed/algorithms/test_join.py::TestJoin::test_single_joinable_disable, test/distributed/algorithms/test_join.py::TestJoin::test_single_joinable_main_hooks, test/distributed/algorithms/test_join.py::TestJoin::test_single_joinable_post_hooks, test/distributed/algorithms/test_join.py::TestJoin::test_single_joinable_throw 2025-12-04T10:37:39.4590027Z 2025-12-04T10:37:39.4590406Z Finished distributed/algorithms/test_join 1/1 ... [2025-12-04 10:37:39.457415][5222700.436452288], took 0.72min 2025-12-04T10:37:39.4591385Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:37:39.4593921Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:37:39.4595795Z Running distributed/tensor/test_dtensor_compile 2/4 ... [2025-12-04 10:37:39.459466][5222700.438507841] 2025-12-04T10:37:39.4596194Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:37:39.4598290Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/test_dtensor_compile.py', '--shard-id=2', '--num-shards=4', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 10:37:39.459682] 2025-12-04T10:38:09.2253945Z 2025-12-04T10:38:09.2255268Z distributed/tensor/test_dtensor_compile 2/4 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_dtensor_compile_2.4_d6f1cff278895a1e_.log 2025-12-04T10:38:09.2261283Z Running 12 items in this shard: test/distributed/tensor/test_dtensor_compile.py::TestDTensorCompile::test_dtensor_basic, test/distributed/tensor/test_dtensor_compile.py::TestDTensorCompile::test_dtensor_dynamic_cat, test/distributed/tensor/test_dtensor_compile.py::TestDTensorCompile::test_dtensor_dynamo_device_mesh_attrs, test/distributed/tensor/test_dtensor_compile.py::TestDTensorCompile::test_dtensor_partial_placement_graph_output, test/distributed/tensor/test_dtensor_compile.py::TestDTensorCompile::test_dynamo_dtensor, test/distributed/tensor/test_dtensor_compile.py::TestDTensorCompile::test_dynamo_dtensor_from_local_redistribute, test/distributed/tensor/test_dtensor_compile.py::TestDTensorCompile::test_dynamo_from_local_grad_placements_sequence_intermediate, test/distributed/tensor/test_dtensor_compile.py::TestDTensorCompile::test_dynamo_to_local_grad_placements_sequence_intermediate, test/distributed/tensor/test_dtensor_compile.py::TestDTensorCompile::test_fakify_dtensor, test/distributed/tensor/test_dtensor_compile.py::TestDTensorCompile::test_placement_compile, test/distributed/tensor/test_dtensor_compile.py::TestDTensorCompile::test_tp_compile_comm_reordering_graph_partition, test/distributed/tensor/test_dtensor_compile.py::TestDTensorCompileE2E::test_compile_dtensor_redistribute_backward_use_ca_True 2025-12-04T10:38:09.2265202Z 2025-12-04T10:38:09.2265484Z Finished distributed/tensor/test_dtensor_compile 2/4 ... [2025-12-04 10:38:09.225070][5222730.204108576], took 0.50min 2025-12-04T10:38:09.2266400Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:38:09.2267649Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:38:09.2270680Z Running distributed/pipelining/test_schedule_multiproc 1/1 ... [2025-12-04 10:38:09.226923][5222730.205964011] 2025-12-04T10:38:09.2271033Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:38:09.2272456Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/pipelining/test_schedule_multiproc.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 10:38:09.227121] 2025-12-04T10:38:30.4284191Z 2025-12-04T10:38:30.4285035Z distributed/pipelining/test_schedule_multiproc 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.pipelining.test_schedule_multiproc_1.1_f8b75c7df4461a65_.log 2025-12-04T10:38:30.4295172Z Running 34 items in this shard: test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_custom_function_callback, test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_eval_inference_mode_ScheduleClass0, test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_eval_inference_mode_ScheduleClass1, test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_eval_inference_mode_ScheduleClass2, test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_eval_inference_mode_ScheduleClass3, test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_eval_inference_mode_ScheduleClass4, test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_forward_only_ScheduleClass0, test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_grad_with_manual_ScheduleClass0_shape_inference_False, test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_grad_with_manual_ScheduleClass0_shape_inference_True, test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_grad_with_manual_ScheduleClass1_shape_inference_False, test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_grad_with_manual_ScheduleClass1_shape_inference_True, test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_grad_with_manual_interleaved_ScheduleClass0, test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_grad_with_manual_interleaved_ScheduleClass1, test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_grad_with_manual_interleaved_ScheduleClass2, test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_grad_with_tracer_ScheduleClass0, test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_grad_with_tracer_ScheduleClass1, test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_kwargs_with_tracer_ScheduleClass0, test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_kwargs_with_tracer_ScheduleClass1, test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_multi_iter_ScheduleClass0, test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_multi_iter_ScheduleClass1, test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_return_output_ScheduleClass0, test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_return_output_ScheduleClass1, test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_return_output_ScheduleClass2, test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_return_output_ScheduleClass3, test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_return_output_ScheduleClass4, test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_schedule_with_weight_update_mlp_e2e_ScheduleClass0, test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_v_shape_schedules_schedule_class0, test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_v_shape_schedules_schedule_class1, 
test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_zero_bubble_with_model_kwargs_ScheduleClass0, test/distributed/pipelining/test_schedule_multiproc.py::ScheduleTest::test_zero_bubble_with_model_kwargs_ScheduleClass1, test/distributed/pipelining/test_schedule_multiproc.py::CustomSchedulesTest::test_non_symmetric_stage_ids_schedule_class0, test/distributed/pipelining/test_schedule_multiproc.py::CustomSchedulesTest::test_non_symmetric_stage_ids_schedule_class1, test/distributed/pipelining/test_schedule_multiproc.py::CustomSchedulesTest::test_pipeline_schedule_runtime_custom_sched_ScheduleClass0, test/distributed/pipelining/test_schedule_multiproc.py::CustomSchedulesTest::test_schedule_with_native_zero_bubble_ScheduleClass0 2025-12-04T10:38:30.4302396Z 2025-12-04T10:38:30.4302597Z Finished distributed/pipelining/test_schedule_multiproc 1/1 ... [2025-12-04 10:38:30.428260][5222751.407298343], took 0.35min 2025-12-04T10:38:30.4303190Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:38:30.4303691Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:38:30.4303995Z Running distributed/pipelining/test_pipe 1/1 ... [2025-12-04 10:38:30.430251][5222751.409292877] 2025-12-04T10:38:30.4304250Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:38:30.4305608Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/pipelining/test_pipe.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:38:30.430435] 2025-12-04T10:38:33.4571465Z 2025-12-04T10:38:33.4572155Z distributed/pipelining/test_pipe 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.pipelining.test_pipe_1.1_2be582dd3db3f15a_.log 2025-12-04T10:38:33.4572899Z Running 3 items in this shard: test/distributed/pipelining/test_pipe.py::PipeTests::test_model_split_ModelClass0, test/distributed/pipelining/test_pipe.py::PipeTests::test_model_split_ModelClass1, test/distributed/pipelining/test_pipe.py::PipeTests::test_model_split_ModelClass2 2025-12-04T10:38:33.4573330Z 2025-12-04T10:38:33.4573468Z Finished distributed/pipelining/test_pipe 1/1 ... [2025-12-04 10:38:33.456802][5222754.435839027], took 0.05min 2025-12-04T10:38:33.4575761Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:38:33.4587854Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:38:33.4588792Z Running distributed/test_compute_comm_reordering 1/1 ... [2025-12-04 10:38:33.458790][5222754.437831611] 2025-12-04T10:38:33.4589014Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:38:33.4592178Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_compute_comm_reordering.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 10:38:33.459010] 2025-12-04T10:40:14.3257846Z 2025-12-04T10:40:14.3258462Z distributed/test_compute_comm_reordering 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_compute_comm_reordering_1.1_3dd7817ad3b3b53e_.log 2025-12-04T10:40:14.3260625Z Running 9 items in this shard: test/distributed/test_compute_comm_reordering.py::TestComputeCommReorderingMultiProc::test_grouped_scheduler_node_combo_kernels_False, test/distributed/test_compute_comm_reordering.py::TestComputeCommReorderingMultiProc::test_grouped_scheduler_node_combo_kernels_True, test/distributed/test_compute_comm_reordering.py::TestComputeCommReorderingMultiProc::test_inductor_default_comms_ordering, test/distributed/test_compute_comm_reordering.py::TestComputeCommReorderingMultiProc::test_nccl_heuristics, test/distributed/test_compute_comm_reordering.py::TestComputeCommReorderingMultiProc::test_raise_comms, test/distributed/test_compute_comm_reordering.py::TestComputeCommReorderingMultiProc::test_reorder_compute_for_overlap, test/distributed/test_compute_comm_reordering.py::TestComputeCommReorderingMultiProc::test_reorder_compute_for_overlap_custom_runtime_estimation, test/distributed/test_compute_comm_reordering.py::TestComputeCommReorderingMultiProc::test_sink_waits, test/distributed/test_compute_comm_reordering.py::TestComputeCommReorderingMultiProc::test_sink_waits_raise_comms 2025-12-04T10:40:14.3262897Z 2025-12-04T10:40:14.3263045Z Finished distributed/test_compute_comm_reordering 1/1 ... [2025-12-04 10:40:14.325489][5222855.304526742], took 1.68min 2025-12-04T10:40:14.3263497Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:40:14.3275345Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:40:14.3277945Z Running distributed/tensor/test_dtensor 3/3 ... [2025-12-04 10:40:14.327714][5222855.306755563] 2025-12-04T10:40:14.3278148Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:40:14.3280336Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/test_dtensor.py', '--shard-id=3', '--num-shards=3', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 10:40:14.327913] 2025-12-04T10:41:11.4366145Z 2025-12-04T10:41:11.4367230Z distributed/tensor/test_dtensor 3/3 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_dtensor_3.3_ee63d11d23a1f90e_.log 2025-12-04T10:41:11.4373237Z Running 25 items in this shard: test/distributed/tensor/test_dtensor.py::DTensorTest::test_dtensor_save_load_import, test/distributed/tensor/test_dtensor.py::DTensorTest::test_from_local, test/distributed/tensor/test_dtensor.py::DTensorTest::test_from_local_then_to_local, test/distributed/tensor/test_dtensor.py::DTensorTest::test_meta_dtensor, test/distributed/tensor/test_dtensor.py::DTensorTest::test_modules_w_meta_dtensor, test/distributed/tensor/test_dtensor.py::DTensorTest::test_shard_tensor, test/distributed/tensor/test_dtensor.py::DTensorTest::test_to_local, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_dtensor_constructor, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_dtensor_save_load, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_from_local_negative_dim, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_from_local_then_to_local, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_full_tensor_grad_hint, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_full_tensor_sync, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_shard_tensor_2d, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_to_local_grad_hint, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_default_value_sub_mesh, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_dtensor_device_mesh_device_conversion, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_metadata_consistency_check, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_dtensor_spec_local_shard_offset, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_inplace_on_local_tensor_view, test/distributed/tensor/test_dtensor.py::TestDTensorPlacementTypesWithLocalTensor::test_split_tensor_1D, test/distributed/tensor/test_dtensor.py::TestDTensorSpec::test_default_shard_order, test/distributed/tensor/test_dtensor.py::TestDTensorSpec::test_dtensor_spec_print, test/distributed/tensor/test_dtensor.py::TestDTensorSpec::test_dtensor_spec_with_invalid_shard_order, test/distributed/tensor/test_dtensor.py::TestDTensorSpecWithLocalTensor::test_dtensor_spec_update 2025-12-04T10:41:11.4376540Z 2025-12-04T10:41:11.4376672Z Finished distributed/tensor/test_dtensor 3/3 ... [2025-12-04 10:41:11.436374][5222912.415411396], took 0.95min 2025-12-04T10:41:11.4377109Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:41:11.4380911Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:41:11.4383166Z Running distributed/test_aten_comm_compute_reordering 3/3 ... 
[2025-12-04 10:41:11.438243][5222912.417284332] 2025-12-04T10:41:11.4383388Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:41:11.4385547Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_aten_comm_compute_reordering.py', '--shard-id=3', '--num-shards=3', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:41:11.438455] 2025-12-04T10:43:29.4169478Z 2025-12-04T10:43:29.4170761Z distributed/test_aten_comm_compute_reordering 3/3 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_aten_comm_compute_reordering_3.3_488211ab712b5ae3_.log 2025-12-04T10:43:29.4177212Z Running 15 items in this shard: test/distributed/test_aten_comm_compute_reordering.py::TestComputeCommReorderingMultiProc::test_grouped_scheduler_node, test/distributed/test_aten_comm_compute_reordering.py::TestComputeCommReorderingMultiProc::test_reorder_compute_for_overlap_mul, test/distributed/test_aten_comm_compute_reordering.py::TestComputeCommReorderingMultiProc::test_schedulable_wait, test/distributed/test_aten_comm_compute_reordering.py::TestComputeCommReorderingBucketing::test_basic_all_reduce_bucketing, test/distributed/test_aten_comm_compute_reordering.py::TestComputeCommReorderingBucketing::test_multidtype_bucketing, test/distributed/test_aten_comm_compute_reordering.py::TestComputeCommReorderingBucketing::test_raise_comms, test/distributed/test_aten_comm_compute_reordering.py::TestComputeCommReorderingBucketing::test_reorder_compute_for_overlap_mul, test/distributed/test_aten_comm_compute_reordering.py::TestManualOverlapBucketing::test_bucketing_reordering_pass_no_bucket, test/distributed/test_aten_comm_compute_reordering.py::TestManualOverlapBucketing::test_bucketing_reordering_pass_single_bucket, test/distributed/test_aten_comm_compute_reordering.py::TestManualOverlapBucketing::test_custom_estimator_for_non_compute_nodes, test/distributed/test_aten_comm_compute_reordering.py::TestManualOverlapBucketing::test_grouped_scheduler_node, test/distributed/test_aten_comm_compute_reordering.py::TestManualOverlapBucketing::test_make_graph_view_and_get_subgraph_by_path_custom_module_stack_fn, test/distributed/test_aten_comm_compute_reordering.py::TestManualOverlapBucketing::test_overlap_scheduling_via_config, test/distributed/test_aten_comm_compute_reordering.py::TestManualOverlapBucketing::test_schedulable_wait, test/distributed/test_aten_comm_compute_reordering.py::TestManualOverlapBucketing::test_sink_waits_raise_comms 2025-12-04T10:43:29.4182021Z 2025-12-04T10:43:29.4182232Z Finished distributed/test_aten_comm_compute_reordering 3/3 ... [2025-12-04 10:43:29.416627][5223050.395664135], took 2.30min 2025-12-04T10:43:29.4182888Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:43:29.4185675Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:43:29.4186647Z Running distributed/tensor/test_redistribute 2/2 ... 
[2025-12-04 10:43:29.418527][5223050.397568321] 2025-12-04T10:43:29.4186964Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:43:29.4188916Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/test_redistribute.py', '--shard-id=2', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:43:29.418747] 2025-12-04T10:44:33.9913644Z 2025-12-04T10:44:33.9914804Z distributed/tensor/test_redistribute 2/2 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_redistribute_2.2_6e81ee66e2d44373_.log 2025-12-04T10:44:33.9929320Z Running 33 items in this shard: test/distributed/tensor/test_redistribute.py::RedistributeTest::test_one_chunk_mesh, test/distributed/tensor/test_redistribute.py::RedistributeTest::test_partial_to_replicate_forward_backward_float32, test/distributed/tensor/test_redistribute.py::RedistributeTest::test_partial_to_shard_complex64, test/distributed/tensor/test_redistribute.py::RedistributeTest::test_replicate_to_local_partial_grad_complex64, test/distributed/tensor/test_redistribute.py::RedistributeTest::test_replicate_to_local_partial_grad_float32, test/distributed/tensor/test_redistribute.py::RedistributeTest::test_replicate_to_replicate_forward_backward_datatype_conversion, test/distributed/tensor/test_redistribute.py::RedistributeTest::test_replicate_to_shard_forward_backward, test/distributed/tensor/test_redistribute.py::RedistributeTest::test_shard_dim_alltoall_complex64, test/distributed/tensor/test_redistribute.py::RedistributeTest::test_shard_dim_alltoall_float32, test/distributed/tensor/test_redistribute.py::RedistributeTest::test_shard_to_replicate_forward_backward_complex64, test/distributed/tensor/test_redistribute.py::MultiDimRedistributeTest::test_redistribute_shard_dim_multi_dim_mesh, test/distributed/tensor/test_redistribute.py::DistributeWithDeviceOrderTest::test_generate_shard_orders, test/distributed/tensor/test_redistribute.py::DistributeWithDeviceOrderTest::test_ordered_distribute_all_combination, test/distributed/tensor/test_redistribute.py::DistributeWithDeviceOrderTest::test_ordered_redistribute_with_partial, test/distributed/tensor/test_redistribute.py::DistributeWithDeviceOrderTest::test_shard_order_same_data_as_strided_shard, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_one_chunk_mesh, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_partial_to_replicate_forward_backward_complex64, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_partial_to_replicate_forward_backward_float32, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_partial_to_shard_complex64, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_redistribute_negative_shard_dim, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_redistribute_to_partial, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_redistribute_uneven_sharding, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_replicate_to_partial, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_replicate_to_replicate_forward_backward, 
test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_replicate_to_replicate_forward_backward_datatype_conversion, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_shard_dim_alltoall_float32, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_shard_to_replicate_forward_backward_datatype_conversion, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_shard_to_replicate_forward_backward_float32, test/distributed/tensor/test_redistribute.py::MultiDimRedistributeTestWithLocalTensor::test_multi_dim_mesh, test/distributed/tensor/test_redistribute.py::DistributeWithDeviceOrderTestWithLocalTensor::test_generate_shard_orders, test/distributed/tensor/test_redistribute.py::DistributeWithDeviceOrderTestWithLocalTensor::test_ordered_redistribute, test/distributed/tensor/test_redistribute.py::DistributeWithDeviceOrderTestWithLocalTensor::test_ordered_redistribute_for_special_placement, test/distributed/tensor/test_redistribute.py::DistributeWithDeviceOrderTestWithLocalTensor::test_ordered_redistribute_with_partial 2025-12-04T10:44:33.9935024Z 2025-12-04T10:44:33.9935166Z Finished distributed/tensor/test_redistribute 2/2 ... [2025-12-04 10:44:33.991153][5223114.970188501], took 1.08min 2025-12-04T10:44:33.9935667Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:44:33.9936069Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:44:33.9937990Z Running distributed/tensor/test_tensor_ops 3/4 ... [2025-12-04 10:44:33.993685][5223114.972726198] 2025-12-04T10:44:33.9938195Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:44:33.9940549Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/test_tensor_ops.py', '--shard-id=3', '--num-shards=4', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 10:44:33.993922] 2025-12-04T10:45:17.5162434Z 2025-12-04T10:45:17.5163416Z distributed/tensor/test_tensor_ops 3/4 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_tensor_ops_3.4_b6a2a119f3629247_.log 2025-12-04T10:45:17.5170146Z Running 17 items in this shard: test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_clone, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_copy_, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_dtensor_dtype_conversion, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_fill_inplace_partial_sum, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_full_like, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_inplace_op, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_new_empty_strided, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_where_type_promotion, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_aten_contiguous, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_equal, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_full_like, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_index, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_inplace_op, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_scatter, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_where_type_promotion, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_zero_inplace, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_zeros_like_partial_sum 2025-12-04T10:45:17.5175232Z 2025-12-04T10:45:17.5175448Z Finished distributed/tensor/test_tensor_ops 3/4 ... [2025-12-04 10:45:17.515931][5223158.49496766], took 0.73min 2025-12-04T10:45:17.5176181Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:45:17.5179349Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:45:17.5179802Z Running distributed/test_device_mesh 1/2 ... [2025-12-04 10:45:17.517872][5223158.496913055] 2025-12-04T10:45:17.5180112Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:45:17.5182682Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_device_mesh.py', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 10:45:17.518097] 2025-12-04T10:47:06.7533984Z 2025-12-04T10:47:06.7535049Z distributed/test_device_mesh 1/2 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_device_mesh_1.2_c017d3d7efedc09c_.log 2025-12-04T10:47:06.7547353Z Running 34 items in this shard: test/distributed/test_device_mesh.py::DeviceMeshSetDeviceTest::test_auto_set_device_from_heuristic, test/distributed/test_device_mesh.py::DeviceMeshSetDeviceTest::test_manual_set_device, test/distributed/test_device_mesh.py::DeviceMeshTest::test_2d_mesh_eager_init_subgroup, test/distributed/test_device_mesh.py::DeviceMeshTest::test_2d_mesh_non_eager_init_subgroup, test/distributed/test_device_mesh.py::DeviceMeshTest::test_assert_invalid_mesh_tensor, test/distributed/test_device_mesh.py::DeviceMeshTest::test_device_mesh_2d, test/distributed/test_device_mesh.py::DeviceMeshTest::test_device_mesh_init_backend, test/distributed/test_device_mesh.py::DeviceMeshTest::test_fake_pg_device_mesh, test/distributed/test_device_mesh.py::DeviceMeshTest::test_from_group_with_invalid_mesh, test/distributed/test_device_mesh.py::DeviceMeshTest::test_get_group_and_get_all_groups, test/distributed/test_device_mesh.py::DeviceMeshTest::test_get_local_rank, test/distributed/test_device_mesh.py::DeviceMeshTest::test_get_local_rank_raises_exception, test/distributed/test_device_mesh.py::DeviceMeshTest::test_get_root_mesh_multiple_independent_meshes, test/distributed/test_device_mesh.py::DeviceMeshTest::test_init_process_group, test/distributed/test_device_mesh.py::InitDeviceMeshTest::test_backend_override_argument_dict_with_idx_and_backend_eager, test/distributed/test_device_mesh.py::TestDeviceMeshGetItem::test_concatenate_3d, test/distributed/test_device_mesh.py::TestDeviceMeshGetItem::test_flatten_mesh_1d, test/distributed/test_device_mesh.py::TestDeviceMeshGetItem::test_flatten_mesh_4d, test/distributed/test_device_mesh.py::TestDeviceMeshGetItem::test_get_item_1d, test/distributed/test_device_mesh.py::TestDeviceMeshGetItem::test_get_item_3d_noncontiguous_slicing, test/distributed/test_device_mesh.py::TestDeviceMeshGetItem::test_reconstruct_mesh_with_flatten_dim, test/distributed/test_device_mesh.py::TestDeviceMeshGetItem::test_unflatten_mesh_2d, test/distributed/test_device_mesh.py::TestDeviceMeshGetItem::test_unflatten_mesh_3d, test/distributed/test_device_mesh.py::TestMeshEnv::test_get_mesh_dim_by_name, test/distributed/test_device_mesh.py::TestMeshEnv::test_get_root_mesh, test/distributed/test_device_mesh.py::TestMeshEnv::test_mesh_slice_fake_tensor_mode, test/distributed/test_device_mesh.py::DeviceMeshCollectiveTest::test_all_gather_uneven, test/distributed/test_device_mesh.py::DeviceMeshCollectiveTest::test_broadcast_1d, test/distributed/test_device_mesh.py::DeviceMeshCollectiveTest::test_scatter_nd, test/distributed/test_device_mesh.py::DeviceMeshCollectiveTest::test_scatter_uneven, test/distributed/test_device_mesh.py::CuTeLayoutTest::test_coalesce, test/distributed/test_device_mesh.py::CuTeLayoutTest::test_coalesce_non_coalescible, test/distributed/test_device_mesh.py::CuTeLayoutTest::test_complement_n_group_layout, test/distributed/test_device_mesh.py::CuTeLayoutTest::test_remap_to_tensor 2025-12-04T10:47:06.7554132Z 2025-12-04T10:47:06.7554308Z Finished distributed/test_device_mesh 1/2 ... 
[2025-12-04 10:47:06.753614][5223267.732650515], took 1.82min 2025-12-04T10:47:06.7554939Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:47:06.7557315Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:47:06.7560040Z Running distributed/tensor/test_convolution_ops 1/1 ... [2025-12-04 10:47:06.755888][5223267.734929115] 2025-12-04T10:47:06.7560342Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:47:06.7562440Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/test_convolution_ops.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:47:06.756125] 2025-12-04T10:48:37.0087935Z 2025-12-04T10:48:37.0089116Z distributed/tensor/test_convolution_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_convolution_ops_1.1_6498de81e25b02fc_.log 2025-12-04T10:48:37.0097624Z Running 16 items in this shard: test/distributed/tensor/test_convolution_ops.py::DistConvolutionOpsTest::test_conv1d, test/distributed/tensor/test_convolution_ops.py::DistConvolutionOpsTest::test_conv2d_module_no_bias, test/distributed/tensor/test_convolution_ops.py::DistConvolutionOpsTest::test_conv2d_no_bias_backward, test/distributed/tensor/test_convolution_ops.py::DistConvolutionOpsTest::test_conv2d_no_bias_compile, test/distributed/tensor/test_convolution_ops.py::DistConvolutionOpsTest::test_conv3d, test/distributed/tensor/test_convolution_ops.py::DistConvolutionOpsTest::test_conv_backward_none_grad_inp, test/distributed/tensor/test_convolution_ops.py::DistConvolutionOpsTest::test_depthwise_convolution, test/distributed/tensor/test_convolution_ops.py::DistConvolutionOpsTest::test_downsampling_convolution, test/distributed/tensor/test_convolution_ops.py::DistConvolutionOpsTestWithLocalTensor::test_conv1d, test/distributed/tensor/test_convolution_ops.py::DistConvolutionOpsTestWithLocalTensor::test_conv2d_module_no_bias, test/distributed/tensor/test_convolution_ops.py::DistConvolutionOpsTestWithLocalTensor::test_conv2d_no_bias_backward, test/distributed/tensor/test_convolution_ops.py::DistConvolutionOpsTestWithLocalTensor::test_conv2d_no_bias_compile, test/distributed/tensor/test_convolution_ops.py::DistConvolutionOpsTestWithLocalTensor::test_conv3d, test/distributed/tensor/test_convolution_ops.py::DistConvolutionOpsTestWithLocalTensor::test_conv_backward_none_grad_inp, test/distributed/tensor/test_convolution_ops.py::DistConvolutionOpsTestWithLocalTensor::test_depthwise_convolution, test/distributed/tensor/test_convolution_ops.py::DistConvolutionOpsTestWithLocalTensor::test_downsampling_convolution 2025-12-04T10:48:37.0101448Z 2025-12-04T10:48:37.0101666Z Finished distributed/tensor/test_convolution_ops 1/1 ... [2025-12-04 10:48:37.008418][5223357.987457145], took 1.50min 2025-12-04T10:48:37.0102335Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:48:37.0102922Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:48:37.0105133Z Running distributed/tensor/parallel/test_tp_style 1/1 ... 
[2025-12-04 10:48:37.010348][5223357.98938905] 2025-12-04T10:48:37.0105451Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:48:37.0106760Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/parallel/test_tp_style.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:48:37.010537] 2025-12-04T10:49:34.5124015Z 2025-12-04T10:49:34.5127201Z distributed/tensor/parallel/test_tp_style 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.parallel.test_tp_style_1.1_3604b571f850ed4b_.log 2025-12-04T10:49:34.5133851Z Running 18 items in this shard: test/distributed/tensor/parallel/test_tp_style.py::TensorParallelStyleTest::test_colwise_parallel_embedding, test/distributed/tensor/parallel/test_tp_style.py::TensorParallelStyleTest::test_colwise_parallel_style, test/distributed/tensor/parallel/test_tp_style.py::TensorParallelStyleTest::test_prepare_module_input, test/distributed/tensor/parallel/test_tp_style.py::TensorParallelStyleTest::test_prepare_module_input_multiple_inputs, test/distributed/tensor/parallel/test_tp_style.py::TensorParallelStyleTest::test_prepare_module_kwargs_input, test/distributed/tensor/parallel/test_tp_style.py::TensorParallelStyleTest::test_prepare_module_output, test/distributed/tensor/parallel/test_tp_style.py::TensorParallelStyleTest::test_rowwise_parallel_embedding, test/distributed/tensor/parallel/test_tp_style.py::TensorParallelStyleTest::test_rowwise_parallel_style, test/distributed/tensor/parallel/test_tp_style.py::TensorParallelStyleTest::test_sequence_parallel_style, test/distributed/tensor/parallel/test_tp_style.py::TensorParallelStyleTestWithLocalTensor::test_colwise_parallel_embedding, test/distributed/tensor/parallel/test_tp_style.py::TensorParallelStyleTestWithLocalTensor::test_colwise_parallel_style, test/distributed/tensor/parallel/test_tp_style.py::TensorParallelStyleTestWithLocalTensor::test_prepare_module_input, test/distributed/tensor/parallel/test_tp_style.py::TensorParallelStyleTestWithLocalTensor::test_prepare_module_input_multiple_inputs, test/distributed/tensor/parallel/test_tp_style.py::TensorParallelStyleTestWithLocalTensor::test_prepare_module_kwargs_input, test/distributed/tensor/parallel/test_tp_style.py::TensorParallelStyleTestWithLocalTensor::test_prepare_module_output, test/distributed/tensor/parallel/test_tp_style.py::TensorParallelStyleTestWithLocalTensor::test_rowwise_parallel_embedding, test/distributed/tensor/parallel/test_tp_style.py::TensorParallelStyleTestWithLocalTensor::test_rowwise_parallel_style, test/distributed/tensor/parallel/test_tp_style.py::TensorParallelStyleTestWithLocalTensor::test_sequence_parallel_style 2025-12-04T10:49:34.5140170Z 2025-12-04T10:49:34.5140452Z Finished distributed/tensor/parallel/test_tp_style 1/1 ... [2025-12-04 10:49:34.512160][5223415.491195863], took 0.96min 2025-12-04T10:49:34.5141136Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:49:34.5146503Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:49:34.5148922Z Running distributed/test_debug 1/1 ... 
[2025-12-04 10:49:34.514792][5223415.493833389] 2025-12-04T10:49:34.5149172Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:49:34.5151119Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_debug.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:49:34.514987] 2025-12-04T10:49:36.9331249Z 2025-12-04T10:49:36.9332357Z distributed/test_debug 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_debug_1.1_ae75ebf1c3ebb08e_.log 2025-12-04T10:49:36.9333206Z Running 1 items in this shard: test/distributed/test_debug.py::TestDebug::test_all 2025-12-04T10:49:36.9333487Z 2025-12-04T10:49:36.9333745Z Finished distributed/test_debug 1/1 ... [2025-12-04 10:49:36.932844][5223417.91188011], took 0.04min 2025-12-04T10:49:36.9343351Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:49:36.9354050Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:49:36.9356650Z Running distributed/test_overlap_bucketing_unit 1/1 ... [2025-12-04 10:49:36.935536][5223417.914576695] 2025-12-04T10:49:36.9357040Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:49:36.9358849Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_overlap_bucketing_unit.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:49:36.935747] 2025-12-04T10:49:42.9091604Z 2025-12-04T10:49:42.9092523Z distributed/test_overlap_bucketing_unit 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_overlap_bucketing_unit_1.1_763225eee4d9b259_.log 2025-12-04T10:49:42.9094643Z Running 9 items in this shard: test/distributed/test_overlap_bucketing_unit.py::TestOverlapPreservingBucketing::test_can_bucket_all_reduce, test/distributed/test_overlap_bucketing_unit.py::TestOverlapPreservingBucketing::test_can_bucket_independent_collectives, test/distributed/test_overlap_bucketing_unit.py::TestOverlapPreservingBucketing::test_can_bucket_multidtype_collectives, test/distributed/test_overlap_bucketing_unit.py::TestOverlapPreservingBucketing::test_can_bucket_with_convert_dtype_as_hiding_nodes, test/distributed/test_overlap_bucketing_unit.py::TestOverlapPreservingBucketing::test_can_bucket_with_multiple_hiding_nodes, test/distributed/test_overlap_bucketing_unit.py::TestOverlapPreservingBucketing::test_cant_bucket_ag_with_rs_hiding_interval_between_final_mm_hidden_False, test/distributed/test_overlap_bucketing_unit.py::TestOverlapPreservingBucketing::test_cant_bucket_ag_with_rs_hiding_interval_between_final_mm_hidden_True, test/distributed/test_overlap_bucketing_unit.py::TestOverlapPreservingBucketing::test_cant_bucket_nested_hiding_intervals, test/distributed/test_overlap_bucketing_unit.py::TestCrossPGOverlap::test_cross_pg_prefetch_during_exposed_wait 2025-12-04T10:49:42.9096911Z 2025-12-04T10:49:42.9097060Z Finished distributed/test_overlap_bucketing_unit 1/1 ... 
[2025-12-04 10:49:42.908927][5223423.887965904], took 0.10min 2025-12-04T10:49:42.9097558Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:49:42.9107514Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:49:42.9110178Z Running distributed/checkpoint/_experimental/test_checkpoint_writer 1/1 ... [2025-12-04 10:49:42.910929][5223423.889969468] 2025-12-04T10:49:42.9110435Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:49:42.9112208Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/_experimental/test_checkpoint_writer.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:49:42.911123] 2025-12-04T10:49:45.1792749Z 2025-12-04T10:49:45.1794206Z distributed/checkpoint/_experimental/test_checkpoint_writer 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint._experimental.test_checkpoint_writer_1.1_94fc265d2f2ccc8a_.log 2025-12-04T10:49:45.1798996Z Running 8 items in this shard: test/distributed/checkpoint/_experimental/test_checkpoint_writer.py::TestCheckpointWriterConfig::test_custom_values, test/distributed/checkpoint/_experimental/test_checkpoint_writer.py::TestCheckpointWriterConfig::test_default_values, test/distributed/checkpoint/_experimental/test_checkpoint_writer.py::TestCheckpointWriter::test_close, test/distributed/checkpoint/_experimental/test_checkpoint_writer.py::TestCheckpointWriter::test_write_calls_barrier, test/distributed/checkpoint/_experimental/test_checkpoint_writer.py::TestCheckpointWriter::test_write_calls_commit_hooks, test/distributed/checkpoint/_experimental/test_checkpoint_writer.py::TestCheckpointWriter::test_write_creates_checkpoint_file, test/distributed/checkpoint/_experimental/test_checkpoint_writer.py::TestCheckpointWriter::test_write_without_barrier, test/distributed/checkpoint/_experimental/test_checkpoint_writer.py::TestCheckpointWriter::test_write_without_commit_hook 2025-12-04T10:49:45.1802093Z 2025-12-04T10:49:45.1802456Z Finished distributed/checkpoint/_experimental/test_checkpoint_writer 1/1 ... [2025-12-04 10:49:45.178980][5223426.158017733], took 0.04min 2025-12-04T10:49:45.1803468Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:49:45.1807858Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:49:45.1810166Z Running distributed/optim/test_named_optimizer 1/1 ... [2025-12-04 10:49:45.180937][5223426.159978858] 2025-12-04T10:49:45.1810506Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:49:45.1812694Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/optim/test_named_optimizer.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 10:49:45.181156] 2025-12-04T10:49:46.3681954Z 2025-12-04T10:49:46.3682849Z distributed/optim/test_named_optimizer 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.optim.test_named_optimizer_1.1_14d96f15037e7378_.log 2025-12-04T10:49:46.3683412Z 2025-12-04T10:49:46.3683673Z Finished distributed/optim/test_named_optimizer 1/1 ... [2025-12-04 10:49:46.367922][5223427.346957067], took 0.02min 2025-12-04T10:49:46.3688861Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:49:46.3699879Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:49:46.3702392Z Running distributed/checkpoint/_experimental/test_checkpointer 1/1 ... [2025-12-04 10:49:46.370151][5223427.349192458] 2025-12-04T10:49:46.3702743Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:49:46.3704801Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/_experimental/test_checkpointer.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:49:46.370367] 2025-12-04T10:50:07.0660661Z 2025-12-04T10:50:07.0661695Z distributed/checkpoint/_experimental/test_checkpointer 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint._experimental.test_checkpointer_1.1_4c484a42515acebe_.log 2025-12-04T10:50:07.0666777Z Running 11 items in this shard: test/distributed/checkpoint/_experimental/test_checkpointer.py::TestCheckpointer::test_load_strict_mode, test/distributed/checkpoint/_experimental/test_checkpointer.py::TestCheckpointer::test_load_with_map_location, test/distributed/checkpoint/_experimental/test_checkpointer.py::TestCheckpointer::test_nested_dict_partial_load, test/distributed/checkpoint/_experimental/test_checkpointer.py::TestCheckpointer::test_partial_load, test/distributed/checkpoint/_experimental/test_checkpointer.py::TestCheckpointer::test_save_and_load_basic, test/distributed/checkpoint/_experimental/test_checkpointer.py::TestCheckpointer::test_save_with_kwargs, test/distributed/checkpoint/_experimental/test_checkpointer.py::TestAsyncCheckpointerSpecific::test_async_error_handling, test/distributed/checkpoint/_experimental/test_checkpointer.py::TestAsyncCheckpointerSpecific::test_async_future_results, test/distributed/checkpoint/_experimental/test_checkpointer.py::TestAsyncCheckpointerSpecific::test_async_multiple_saves_ordering, test/distributed/checkpoint/_experimental/test_checkpointer.py::TestAsyncCheckpointerSpecific::test_async_returns_futures, test/distributed/checkpoint/_experimental/test_checkpointer.py::TestAsyncCheckpointerSpecific::test_async_sequential_saves_wait 2025-12-04T10:50:07.0670602Z 2025-12-04T10:50:07.0670918Z Finished distributed/checkpoint/_experimental/test_checkpointer 1/1 ... [2025-12-04 10:50:07.065734][5223448.044768197], took 0.34min 2025-12-04T10:50:07.0671832Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:50:07.0681825Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:50:07.0684127Z Running distributed/tensor/test_api 1/1 ... 
[2025-12-04 10:50:07.068276][5223448.047316745] 2025-12-04T10:50:07.0684399Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:50:07.0686036Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/test_api.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:50:07.068497] 2025-12-04T10:51:01.8228151Z 2025-12-04T10:51:01.8232018Z distributed/tensor/test_api 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_api_1.1_a0e6e6cdf7c9d61c_.log 2025-12-04T10:51:01.8238224Z Running 18 items in this shard: test/distributed/tensor/test_api.py::DTensorAPITest::test_checkpoint_apis_check_partial_placement, test/distributed/tensor/test_api.py::DTensorAPITest::test_distribute_module, test/distributed/tensor/test_api.py::DTensorAPITest::test_distribute_module_casting, test/distributed/tensor/test_api.py::DTensorAPITest::test_distribute_module_input_fn_output_fn, test/distributed/tensor/test_api.py::DTensorAPITest::test_distribute_module_input_fn_output_fn_warning, test/distributed/tensor/test_api.py::DTensorAPITest::test_distribute_module_meta, test/distributed/tensor/test_api.py::DTensorAPITest::test_distribute_tensor_errors, test/distributed/tensor/test_api.py::DTensorAPITest::test_distribute_tensor_rank, test/distributed/tensor/test_api.py::DTensorAPITest::test_distribute_tensor_uneven_sharding, test/distributed/tensor/test_api.py::DTensorAPITestWithLocalTensor::test_checkpoint_apis_check_partial_placement, test/distributed/tensor/test_api.py::DTensorAPITestWithLocalTensor::test_distribute_module, test/distributed/tensor/test_api.py::DTensorAPITestWithLocalTensor::test_distribute_module_casting, test/distributed/tensor/test_api.py::DTensorAPITestWithLocalTensor::test_distribute_module_input_fn_output_fn, test/distributed/tensor/test_api.py::DTensorAPITestWithLocalTensor::test_distribute_module_input_fn_output_fn_warning, test/distributed/tensor/test_api.py::DTensorAPITestWithLocalTensor::test_distribute_module_meta, test/distributed/tensor/test_api.py::DTensorAPITestWithLocalTensor::test_distribute_tensor_errors, test/distributed/tensor/test_api.py::DTensorAPITestWithLocalTensor::test_distribute_tensor_rank, test/distributed/tensor/test_api.py::DTensorAPITestWithLocalTensor::test_distribute_tensor_uneven_sharding 2025-12-04T10:51:01.8243293Z 2025-12-04T10:51:01.8243540Z Finished distributed/tensor/test_api 1/1 ... [2025-12-04 10:51:01.822642][5223502.801679633], took 0.91min 2025-12-04T10:51:01.8244225Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:51:01.8247074Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:51:01.8249442Z Running distributed/tensor/test_init 1/1 ... [2025-12-04 10:51:01.824851][5223502.803893015] 2025-12-04T10:51:01.8249925Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:51:01.8251937Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/test_init.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 10:51:01.825068] 2025-12-04T10:51:35.8442139Z 2025-12-04T10:51:35.8442804Z distributed/tensor/test_init 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_init_1.1_cead5736e32acdcb_.log 2025-12-04T10:51:35.8444985Z Running 13 items in this shard: test/distributed/tensor/test_init.py::DTensorInitOpsTest::test_init_ops, test/distributed/tensor/test_init.py::DTensorConstructorTest::test_empty, test/distributed/tensor/test_init.py::DTensorConstructorTest::test_full, test/distributed/tensor/test_init.py::DTensorConstructorTest::test_ones, test/distributed/tensor/test_init.py::DTensorConstructorTest::test_zeros, test/distributed/tensor/test_init.py::DTensorConstructorTest::test_zeros_full_mesh, test/distributed/tensor/test_init.py::DTensorConstructorTest::test_zeros_submesh, test/distributed/tensor/test_init.py::DTensorConstructorTestWithLocalTensor::test_empty, test/distributed/tensor/test_init.py::DTensorConstructorTestWithLocalTensor::test_full, test/distributed/tensor/test_init.py::DTensorConstructorTestWithLocalTensor::test_ones, test/distributed/tensor/test_init.py::DTensorConstructorTestWithLocalTensor::test_zeros, test/distributed/tensor/test_init.py::DTensorConstructorTestWithLocalTensor::test_zeros_full_mesh, test/distributed/tensor/test_init.py::DTensorConstructorTestWithLocalTensor::test_zeros_submesh 2025-12-04T10:51:35.8447267Z 2025-12-04T10:51:35.8447395Z Finished distributed/tensor/test_init 1/1 ... [2025-12-04 10:51:35.843982][5223536.823018951], took 0.57min 2025-12-04T10:51:35.8447968Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:51:35.8458504Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:51:35.8461095Z Running distributed/checkpoint/e2e/test_fine_tuning 1/1 ... [2025-12-04 10:51:35.846007][5223536.825048315] 2025-12-04T10:51:35.8461320Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:51:35.8463267Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/e2e/test_fine_tuning.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:51:35.846223] 2025-12-04T10:51:46.3771657Z 2025-12-04T10:51:46.3773151Z distributed/checkpoint/e2e/test_fine_tuning 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.e2e.test_fine_tuning_1.1_f56a2870d773053c_.log 2025-12-04T10:51:46.3774293Z Running 1 items in this shard: test/distributed/checkpoint/e2e/test_fine_tuning.py::TestFineTuning::test_fine_tuning 2025-12-04T10:51:46.3774730Z 2025-12-04T10:51:46.3775057Z Finished distributed/checkpoint/e2e/test_fine_tuning 1/1 ... [2025-12-04 10:51:46.377087][5223547.356123006], took 0.18min 2025-12-04T10:51:46.3782533Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:51:46.3794447Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:51:46.3796823Z Running distributed/tensor/test_matrix_ops 1/1 ... 
[2025-12-04 10:51:46.379600][5223547.358641694] 2025-12-04T10:51:46.3797171Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:51:46.3800077Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/test_matrix_ops.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:51:46.379812] 2025-12-04T10:53:28.4781104Z 2025-12-04T10:53:28.4781990Z distributed/tensor/test_matrix_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_matrix_ops_1.1_9b0fbc70f0d13ab5_.log 2025-12-04T10:53:28.4786291Z Running 30 items in this shard: test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_addmm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_addmm_auto_redistribute, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_addmm_empty_operand, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_baddbmm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_bmm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_dtensor_mm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_grouped_mm_kwargs0, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_grouped_mm_kwargs1, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_matmul, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_mm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_scaled_dot_product_attention, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_scaled_mm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_t, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_t_partial, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_tensordot_shampoo, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_addmm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_addmm_auto_redistribute, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_addmm_empty_operand, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_baddbmm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_bmm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_dtensor_mm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_grouped_mm_kwargs0, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_grouped_mm_kwargs1, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_matmul, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_mm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_scaled_dot_product_attention, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_scaled_mm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_t, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_t_partial, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_tensordot_shampoo 2025-12-04T10:53:28.4791137Z 2025-12-04T10:53:28.4791281Z Finished distributed/tensor/test_matrix_ops 1/1 ... 
[2025-12-04 10:53:28.477791][5223649.456830324], took 1.70min 2025-12-04T10:53:28.4791759Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:53:28.4796645Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:53:28.4799010Z Running distributed/pipelining/test_stage 1/1 ... [2025-12-04 10:53:28.479822][5223649.458863738] 2025-12-04T10:53:28.4799239Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:53:28.4801258Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/pipelining/test_stage.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:53:28.480013] 2025-12-04T10:53:55.5455883Z 2025-12-04T10:53:55.5457472Z distributed/pipelining/test_stage 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.pipelining.test_stage_1.1_55b799f0e82c0299_.log 2025-12-04T10:53:55.5459437Z Running 8 items in this shard: test/distributed/pipelining/test_stage.py::StageTest::test_custom_dw_with_fb_schedule, test/distributed/pipelining/test_stage.py::StageTest::test_manual, test/distributed/pipelining/test_stage.py::StageTest::test_output_chunks_memory_usage, test/distributed/pipelining/test_stage.py::StageTest::test_tracer_ModelClass0, test/distributed/pipelining/test_stage.py::StageTest::test_tracer_ModelClass1, test/distributed/pipelining/test_stage.py::StageTest::test_tracer_kwargs_ModelClass0, test/distributed/pipelining/test_stage.py::StageNegativeTest::test_custom_dw_errors, test/distributed/pipelining/test_stage.py::StageNegativeTest::test_shape_prop_mismatch 2025-12-04T10:53:55.5460995Z 2025-12-04T10:53:55.5461202Z Finished distributed/pipelining/test_stage 1/1 ... [2025-12-04 10:53:55.545358][5223676.524394337], took 0.45min 2025-12-04T10:53:55.5464930Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:53:55.5476033Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:53:55.5478059Z Running distributed/tensor/parallel/test_tp_random_state 1/1 ... [2025-12-04 10:53:55.547715][5223676.526756487] 2025-12-04T10:53:55.5478498Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:53:55.5480632Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/parallel/test_tp_random_state.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:53:55.547930] 2025-12-04T10:54:03.8750673Z 2025-12-04T10:54:03.8751989Z distributed/tensor/parallel/test_tp_random_state 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.parallel.test_tp_random_state_1.1_3ff170dcbb1f0e74_.log 2025-12-04T10:54:03.8753529Z Running 1 items in this shard: test/distributed/tensor/parallel/test_tp_random_state.py::TensorParallelRandomStateTests::test_model_init 2025-12-04T10:54:03.8754132Z 2025-12-04T10:54:03.8754579Z Finished distributed/tensor/parallel/test_tp_random_state 1/1 ... 
[2025-12-04 10:54:03.874707][5223684.853742402], took 0.14min 2025-12-04T10:54:03.8759754Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:54:03.8773191Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:54:03.8774063Z Running distributed/checkpoint/test_planner 1/1 ... [2025-12-04 10:54:03.877247][5223684.856288749] 2025-12-04T10:54:03.8774340Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:54:03.8776590Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_planner.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:54:03.877471] 2025-12-04T10:54:06.0954195Z 2025-12-04T10:54:06.0955222Z distributed/checkpoint/test_planner 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_planner_1.1_c0fc3fc5e7160f63_.log 2025-12-04T10:54:06.0961715Z Running 17 items in this shard: test/distributed/checkpoint/test_planner.py::TestSavePlan::test_dedup_plans, test/distributed/checkpoint/test_planner.py::TestSavePlan::test_finish_plan_with_caching, test/distributed/checkpoint/test_planner.py::TestSavePlan::test_global_plan, test/distributed/checkpoint/test_planner.py::TestSavePlan::test_global_plan_with_caching, test/distributed/checkpoint/test_planner.py::TestSavePlan::test_load_with_resharding, test/distributed/checkpoint/test_planner.py::TestSavePlan::test_load_with_world_size_diff_by_one, test/distributed/checkpoint/test_planner.py::TestSavePlan::test_local_load_plan, test/distributed/checkpoint/test_planner.py::TestSavePlan::test_local_plan, test/distributed/checkpoint/test_planner.py::TestSavePlan::test_local_plan_with_caching, test/distributed/checkpoint/test_planner.py::TestPlannerHelpers::test_compare_save_plans, test/distributed/checkpoint/test_planner.py::TestPlannerHelpers::test_create_read_item_from_chunks, test/distributed/checkpoint/test_planner.py::TestPlannerHelpers::test_merge_delta_local_plans, test/distributed/checkpoint/test_planner.py::TestValidateGlobalPlan::test_detect_overlapping_chunks, test/distributed/checkpoint/test_planner.py::TestValidateGlobalPlan::test_non_overlapping_chunks, test/distributed/checkpoint/test_planner.py::TestLoadPlanner::test_load_different_sizes_throws, test/distributed/checkpoint/test_planner.py::TestLoadPlanner::test_strict, test/distributed/checkpoint/test_planner.py::TestLoadPlanner::test_version_key_in_planner_data 2025-12-04T10:54:06.0966680Z 2025-12-04T10:54:06.0966988Z Finished distributed/checkpoint/test_planner 1/1 ... [2025-12-04 10:54:06.095178][5223687.074212342], took 0.04min 2025-12-04T10:54:06.0967998Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:54:06.0976181Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:54:06.0978442Z Running distributed/checkpoint/test_dtensor_checkpoint 1/1 ... 
[2025-12-04 10:54:06.097755][5223687.076796169] 2025-12-04T10:54:06.0978751Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:54:06.0981276Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_dtensor_checkpoint.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:54:06.097983] 2025-12-04T10:54:13.2752499Z 2025-12-04T10:54:13.2753687Z distributed/checkpoint/test_dtensor_checkpoint 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_dtensor_checkpoint_1.1_889486be9b42fe95_.log 2025-12-04T10:54:13.2755124Z Running 1 items in this shard: test/distributed/checkpoint/test_dtensor_checkpoint.py::DTensorPlanner::test_distributed_tensor_planner 2025-12-04T10:54:13.2755739Z 2025-12-04T10:54:13.2756175Z Finished distributed/checkpoint/test_dtensor_checkpoint 1/1 ... [2025-12-04 10:54:13.275007][5223694.254041466], took 0.12min 2025-12-04T10:54:13.2761906Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:54:13.2771789Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:54:13.2773883Z Running distributed/pipelining/test_schedule 1/1 ... [2025-12-04 10:54:13.277266][5223694.256307927] 2025-12-04T10:54:13.2774218Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:54:13.2776293Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/pipelining/test_schedule.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 10:54:13.277483] 2025-12-04T10:54:52.4951120Z 2025-12-04T10:54:52.4951940Z distributed/pipelining/test_schedule 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.pipelining.test_schedule_1.1_5adcec24b68fb3cf_.log 2025-12-04T10:54:52.4962891Z Running 43 items in this shard: test/distributed/pipelining/test_schedule.py::ScheduleTest::test_get_schedule_class, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_eval_then_train_ScheduleClass0, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_eval_then_train_ScheduleClass1, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_eval_then_train_ScheduleClass2, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_eval_then_train_ScheduleClass3, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_eval_then_train_ScheduleClass4, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_with_single_stage_ScheduleClass0, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_with_single_stage_ScheduleClass1, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_with_single_stage_ScheduleClass2, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_with_single_stage_ScheduleClass3, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_with_single_stage_ScheduleClass4, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_zero_bubble_schedule_errors_with_compile_ScheduleClass0, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_zero_bubble_schedule_errors_with_compile_ScheduleClass1, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_zero_bubble_schedule_errors_with_compile_ScheduleClass2, test/distributed/pipelining/test_schedule.py::TestSchedulePlan::test_pipeline_order_ScheduleClass0, test/distributed/pipelining/test_schedule.py::TestSchedulePlan::test_pipeline_order_ScheduleClass1, test/distributed/pipelining/test_schedule.py::TestSchedulePlan::test_pipeline_order_flex_and_zero_bubble_ScheduleClass0, test/distributed/pipelining/test_schedule.py::TestSchedulePlan::test_pipeline_order_flex_and_zero_bubble_ScheduleClass1, test/distributed/pipelining/test_schedule.py::TestSchedulePlan::test_pipeline_order_for_v_schedules_ScheduleClass0, test/distributed/pipelining/test_schedule.py::TestSchedulePlan::test_pipeline_order_for_v_schedules_ScheduleClass1, test/distributed/pipelining/test_schedule.py::TestScheduleCsv::test_csv_compare_ScheduleClass0_csv_name_dualpipev_4rank_10mb, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_action_parse_action_str_and_ref0, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_action_parse_action_str_and_ref1, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_action_parse_action_str_and_ref2, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_action_parse_action_str_and_ref3, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_action_parse_action_str_and_ref4, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_action_parse_action_str_and_ref5, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_action_parse_action_str_and_ref6, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_action_parse_action_str_and_ref7, 
test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_csv_csv_name_zb1p_2rank_2stagep, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_grad_with_split_b_w, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_grad_with_v_schedule, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_merge_bw_test_info0, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_reduce_grad_test_info0, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_reduce_grad_test_info1, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_send_recv_test_info0, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_send_recv_test_info1, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_unshard_reshard_test_info0, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_unshard_reshard_test_info1, test/distributed/pipelining/test_schedule.py::TestValidateSchedule::test_invalid_schedule_missing_action, test/distributed/pipelining/test_schedule.py::TestValidateSchedule::test_invalid_schedule_missing_rank, test/distributed/pipelining/test_schedule.py::TestValidateSchedule::test_valid_schedule, test/distributed/pipelining/test_schedule.py::ScheduleUtilTests::test_generate_stage_to_rank_mapping 2025-12-04T10:54:52.4971295Z 2025-12-04T10:54:52.4971512Z Finished distributed/pipelining/test_schedule 1/1 ... [2025-12-04 10:54:52.494740][5223733.473778975], took 0.65min 2025-12-04T10:54:52.4972021Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:54:52.4972469Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:54:52.4972767Z Running distributed/_composable/fsdp/test_fully_shard_overlap 1/1 ... [2025-12-04 10:54:52.496889][5223733.475929578] 2025-12-04T10:54:52.4973022Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:54:52.4973519Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_composable/fsdp/test_fully_shard_overlap.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:54:52.497085] 2025-12-04T10:55:02.7280847Z 2025-12-04T10:55:02.7281791Z distributed/_composable/fsdp/test_fully_shard_overlap 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.fsdp.test_fully_shard_overlap_1.1_01893b58c1154003_.log 2025-12-04T10:55:02.7283080Z Running 2 items in this shard: test/distributed/_composable/fsdp/test_fully_shard_overlap.py::TestFullyShardOverlap::test_fully_shard_post_optim_event_overlap, test/distributed/_composable/fsdp/test_fully_shard_overlap.py::TestFullyShardOverlap::test_fully_shard_training_overlap 2025-12-04T10:55:02.7283543Z 2025-12-04T10:55:02.7283717Z Finished distributed/_composable/fsdp/test_fully_shard_overlap 1/1 ... 
[2025-12-04 10:55:02.727702][5223743.706737574], took 0.17min 2025-12-04T10:55:02.7290514Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:55:02.7304843Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:55:02.7306806Z Running distributed/test_run 1/1 ... [2025-12-04 10:55:02.730532][5223743.709573408] 2025-12-04T10:55:02.7307269Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:55:02.7308854Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_run.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 10:55:02.730740] 2025-12-04T10:55:04.9488048Z 2025-12-04T10:55:04.9489047Z distributed/test_run 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_run_1.1_ae77be8219a4a84d_.log 2025-12-04T10:55:04.9491124Z Running 4 items in this shard: test/distributed/test_run.py::RunTest::test_config_from_args_signals_to_handle, test/distributed/test_run.py::RunTest::test_launch_agent_sets_environment_variable, test/distributed/test_run.py::RunTest::test_signals_to_handle_custom, test/distributed/test_run.py::RunTest::test_signals_to_handle_default 2025-12-04T10:55:04.9492425Z 2025-12-04T10:55:04.9492740Z Finished distributed/test_run 1/1 ... [2025-12-04 10:55:04.948422][5223745.927458034], took 0.04min 2025-12-04T10:55:04.9499065Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:55:04.9513489Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:55:04.9514750Z Running distributed/tensor/test_math_ops 1/1 ... [2025-12-04 10:55:04.951334][5223745.930374486] 2025-12-04T10:55:04.9515463Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:55:04.9517847Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/test_math_ops.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 10:55:04.951570] 2025-12-04T10:57:40.1900990Z 2025-12-04T10:57:40.1901879Z distributed/tensor/test_math_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_math_ops_1.1_4b2d9d7d9577b7a3_.log 2025-12-04T10:57:40.1913777Z Running 54 items in this shard: test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_conj_complex_dtensor, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_cumsum, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_foreach_add_different_mesh, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_foreach_norm, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_foreach_norm_different_mesh, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_foreach_norm_partial, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_histc, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_layer_norm_bwd, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_layer_norm_bwd_req_grad, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_layer_norm_fwd, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_linalg_eigh, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_linear_op_reductions, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_logsumexp, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_matching_partial_reduction_ops, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_mean, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_nll_loss_and_cross_entropy, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_partial_reduction_ops, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_rotary_embedding_complex_ops, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_shard0_svd, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_shard_math_ops, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_softmax_fwd, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_softmax_with_bwd, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_std, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_topk, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_upsampling, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_vector_norm, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_vector_norm_partial, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_conj_complex_dtensor, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_cumsum, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_foreach_add_different_mesh, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_foreach_norm, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_foreach_norm_different_mesh, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_foreach_norm_partial, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_histc, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_layer_norm_bwd, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_layer_norm_bwd_req_grad, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_layer_norm_fwd, 
test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_linalg_eigh, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_linear_op_reductions, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_logsumexp, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_matching_partial_reduction_ops, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_mean, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_nll_loss_and_cross_entropy, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_partial_reduction_ops, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_rotary_embedding_complex_ops, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_shard0_svd, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_shard_math_ops, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_softmax_fwd, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_softmax_with_bwd, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_std, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_topk, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_upsampling, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_vector_norm, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_vector_norm_partial 2025-12-04T10:57:40.1922529Z 2025-12-04T10:57:40.1922674Z Finished distributed/tensor/test_math_ops 1/1 ... [2025-12-04 10:57:40.189780][5223901.168818333], took 2.59min 2025-12-04T10:57:40.1923180Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T10:57:40.1923584Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T10:57:40.1923811Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T10:57:40.1923999Z Uploading artifacts took 0.00 seconds 2025-12-04T10:57:40.1924508Z Running distributed/tensor/test_pointwise_ops 1/1 ... [2025-12-04 10:57:40.192336][5223901.171368421] 2025-12-04T10:57:40.1924726Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T10:57:40.1926564Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/test_pointwise_ops.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 10:57:40.192563] 2025-12-04T11:27:51.1753855Z 2025-12-04T11:27:51.1754713Z PRINTING LOG FILE of distributed/tensor/test_pointwise_ops 1/1 (test/test-reports/distributed.tensor.test_pointwise_ops_1.1_b92502ef2d44a395_.log) 2025-12-04T11:27:51.1755685Z Test results will be stored in test-reports/python-pytest/distributed.tensor.test_pointwise_ops/distributed.tensor.test_pointwise_ops-7404e7f097e17487.xml 2025-12-04T11:27:51.1756346Z ============================= test session starts ============================== 2025-12-04T11:27:51.1756858Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:27:51.1757270Z cachedir: .pytest_cache 2025-12-04T11:27:51.1757799Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:27:51.1758325Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:27:51.1758564Z configfile: pytest.ini 2025-12-04T11:27:51.1759026Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:27:51.1759531Z collecting ... collected 18 items 2025-12-04T11:27:51.1759989Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T11:27:51.1766370Z Running 18 items in this shard: test/distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_activations, test/distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_dropout, test/distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_dropout_backward, test/distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_dropout_errors, test/distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_inplace_op_partial_to_replicate, test/distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_mul_out, test/distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_mul_partial, test/distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_partial_add, test/distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_partial_replicate_add, test/distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTestWithLocalTensor::test_activations, test/distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTestWithLocalTensor::test_dropout, test/distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTestWithLocalTensor::test_dropout_backward, test/distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTestWithLocalTensor::test_dropout_errors, test/distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTestWithLocalTensor::test_inplace_op_partial_to_replicate, test/distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTestWithLocalTensor::test_mul_out, test/distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTestWithLocalTensor::test_mul_partial, test/distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTestWithLocalTensor::test_partial_add, test/distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTestWithLocalTensor::test_partial_replicate_add 2025-12-04T11:27:51.1770887Z 2025-12-04T11:27:51.1771103Z distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_activations PASSED [0.5603s] [ 5%] 2025-12-04T11:27:51.1771717Z distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_dropout SKIPPED [0.0002s] (testing RNG based ops is broken: https://github.com/pytorch/PiPPy/issues/494) [ 11%] 
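The pytest session header above reports loading a Hypothesis profile named 'pytorch_ci' (database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]). The snippet below is only a minimal sketch of how such a profile is registered with the Hypothesis settings API; it is not PyTorch's actual conftest code, and where PyTorch loads the profile is not shown in this log.

# Minimal sketch (not PyTorch's actual conftest) of registering the Hypothesis
# profile the session header reports: 'pytorch_ci' -> database=None,
# max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow].
from hypothesis import HealthCheck, settings

settings.register_profile(
    "pytorch_ci",
    database=None,                                 # do not persist failing examples between runs
    max_examples=50,                               # bound per-test property-based search
    derandomize=True,                              # deterministic example generation in CI
    suppress_health_check=[HealthCheck.too_slow],  # slow CI hardware should not trip health checks
)
settings.load_profile("pytorch_ci")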
2025-12-04T11:27:51.1772341Z distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_dropout_backward Command took >30min, returning 124 2025-12-04T11:27:51.1772670Z Got exit code 124 2025-12-04T11:27:51.1772813Z Retrying single test... 2025-12-04T11:27:51.1773216Z Test results will be stored in test-reports/python-pytest/distributed.tensor.test_pointwise_ops/distributed.tensor.test_pointwise_ops-26d6a19f76ee9373.xml 2025-12-04T11:27:51.1773652Z ============================= test session starts ============================== 2025-12-04T11:27:51.1773961Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:27:51.1774240Z cachedir: .pytest_cache 2025-12-04T11:27:51.1774556Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:27:51.1774902Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:27:51.1775065Z configfile: pytest.ini 2025-12-04T11:27:51.1775392Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:27:51.1775783Z collecting ... collected 18 items / 17 deselected / 1 selected 2025-12-04T11:27:51.1776224Z stepcurrent: skipping 2 already run items. Running only test/distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_dropout_backward 2025-12-04T11:27:51.1776608Z Running 1 items in this shard 2025-12-04T11:27:51.1776714Z 2025-12-04T11:27:51.1776926Z distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_dropout_backward PASSED [0.4689s] [100%] 2025-12-04T11:27:51.1777191Z 2025-12-04T11:27:51.1777567Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.tensor.test_pointwise_ops/distributed.tensor.test_pointwise_ops-26d6a19f76ee9373.xml - 2025-12-04T11:27:51.1778061Z ======================= 1 passed, 17 deselected in 0.48s ======================= 2025-12-04T11:27:51.1778227Z Got exit code 0 2025-12-04T11:27:51.1778441Z Test succeeded in new process, continuing with the rest of the tests 2025-12-04T11:27:51.1778805Z Test results will be stored in test-reports/python-pytest/distributed.tensor.test_pointwise_ops/distributed.tensor.test_pointwise_ops-a434aa288c83b7b1.xml 2025-12-04T11:27:51.1779141Z ============================= test session starts ============================== 2025-12-04T11:27:51.1779373Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:27:51.1779623Z cachedir: .pytest_cache 2025-12-04T11:27:51.1779875Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:27:51.1780137Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:27:51.1780263Z configfile: pytest.ini 2025-12-04T11:27:51.1780516Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:27:51.1780822Z collecting ... collected 18 items / 3 deselected / 15 selected 2025-12-04T11:27:51.1781005Z stepcurrent: skipping 3 already run items. 
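In the run above, the pointwise-ops command exceeded its time budget and returned exit code 124, so the harness reran only the interrupted test (test_dropout_backward) in a fresh process, got exit code 0, and then continued with the remaining items. The sketch below is a rough, hypothetical illustration of that control flow; the helper name run_pytest, the timeout constant, and the argument lists are assumptions, not PyTorch's actual run_test.py logic.

# Hypothetical sketch of the timeout-and-retry behaviour visible above:
# exit code 124 marks a timed-out pytest command; the interrupted test is then
# rerun alone in a new process, and the shard continues if that retry passes.
# run_pytest, TIMEOUT_S and the argument lists are illustrative assumptions.
import subprocess
import sys

TIMEOUT_S = 30 * 60  # the log reports "Command took >30min, returning 124"

def run_pytest(args, timeout=None):
    """Run pytest in a subprocess and return its exit code (124 on timeout)."""
    try:
        proc = subprocess.run([sys.executable, "-m", "pytest", *args], timeout=timeout)
        return proc.returncode
    except subprocess.TimeoutExpired:
        return 124

code = run_pytest(["distributed/tensor/test_pointwise_ops.py", "-v"], timeout=TIMEOUT_S)
if code == 124:
    # Retry only the test that was running when the timeout hit, in a fresh process.
    single = "distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_dropout_backward"
    if run_pytest([single, "-v"]) == 0:
        # "Test succeeded in new process, continuing with the rest of the tests":
        # resume the shard, skipping the items that already ran (selection omitted here).
        pass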
2025-12-04T11:27:51.1781152Z Running 15 items in this shard 2025-12-04T11:27:51.1781230Z 2025-12-04T11:27:51.1781394Z distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_dropout_errors PASSED [0.3786s] [ 6%] 2025-12-04T11:27:51.1781782Z distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_inplace_op_partial_to_replicate PASSED [0.0539s] [ 13%] 2025-12-04T11:27:51.1782149Z distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_mul_out PASSED [0.0941s] [ 20%] 2025-12-04T11:27:51.1782535Z distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_mul_partial PASSED [0.0736s] [ 26%] 2025-12-04T11:27:51.1782874Z distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_partial_add PASSED [0.0110s] [ 33%] 2025-12-04T11:27:51.1786079Z distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_partial_replicate_add PASSED [0.0268s] [ 40%] 2025-12-04T11:27:51.1786481Z distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTestWithLocalTensor::test_activations PASSED [0.1851s] [ 46%] 2025-12-04T11:27:51.1787004Z distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTestWithLocalTensor::test_dropout SKIPPED [0.0002s] (testing RNG based ops is broken: https://github.com/pytorch/PiPPy/issues/494) [ 53%] 2025-12-04T11:27:51.1787534Z distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTestWithLocalTensor::test_dropout_backward PASSED [0.0201s] [ 60%] 2025-12-04T11:27:51.1787961Z distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTestWithLocalTensor::test_dropout_errors PASSED [0.0081s] [ 66%] 2025-12-04T11:27:51.1788403Z distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTestWithLocalTensor::test_inplace_op_partial_to_replicate PASSED [0.0323s] [ 73%] 2025-12-04T11:27:51.1788808Z distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTestWithLocalTensor::test_mul_out PASSED [0.0414s] [ 80%] 2025-12-04T11:27:51.1789164Z distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTestWithLocalTensor::test_mul_partial PASSED [0.2208s] [ 86%] 2025-12-04T11:27:51.1789527Z distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTestWithLocalTensor::test_partial_add PASSED [0.0288s] [ 93%] 2025-12-04T11:27:51.1789947Z distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTestWithLocalTensor::test_partial_replicate_add PASSED [0.1147s] [100%] 2025-12-04T11:27:51.1790159Z 2025-12-04T11:27:51.1790415Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.tensor.test_pointwise_ops/distributed.tensor.test_pointwise_ops-a434aa288c83b7b1.xml - 2025-12-04T11:27:51.1790781Z ================= 14 passed, 1 skipped, 3 deselected in 1.31s ================== 2025-12-04T11:27:51.1791115Z [W1204 11:27:50.971711057 ProcessGroup.cpp:367] Warning: At the time of process termination, there are still 12 unwaited collective calls. Please review your program to ensure that: 2025-12-04T11:27:51.1791529Z 1. c10d_functional.wait_tensor() is invoked on all tensors returned from c10d_functional collective, 2025-12-04T11:27:51.1791899Z 2. c10d_functional.wait_tensor() is invoked on all output tensors of async_op=True torch.distributed collective called under `with allow_inflight_collective_as_graph_input_ctx():`, 2025-12-04T11:27:51.1792243Z before the output tensors of the collective are used. 
(function ~WorkRegistry) 2025-12-04T11:27:51.1792587Z The following tests failed and then succeeded when run in a new process['test/distributed/tensor/test_pointwise_ops.py::DistElementwiseOpsTest::test_dropout_backward'] 2025-12-04T11:27:51.1792834Z 2025-12-04T11:27:51.1793040Z FINISHED PRINTING LOG FILE of distributed/tensor/test_pointwise_ops 1/1 (test/test-reports/distributed.tensor.test_pointwise_ops_1.1_b92502ef2d44a395_.log) 2025-12-04T11:27:51.1793277Z 2025-12-04T11:27:51.1793410Z Finished distributed/tensor/test_pointwise_ops 1/1 ... [2025-12-04 11:27:51.174672][5225712.153709353], took 30.18min 2025-12-04T11:27:51.1793846Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T11:27:51.1794241Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:27:51.1794455Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T11:27:51.1794634Z Uploading artifacts took 0.00 seconds 2025-12-04T11:27:51.1794875Z Running distributed/checkpoint/test_compatibility 1/1 ... [2025-12-04 11:27:51.176855][5225712.155896255] 2025-12-04T11:27:51.1795083Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:27:51.1795503Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_compatibility.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:27:51.177075] 2025-12-04T11:27:53.3449020Z 2025-12-04T11:27:53.3449851Z distributed/checkpoint/test_compatibility 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_compatibility_1.1_70f9582afd7d5a9a_.log 2025-12-04T11:27:53.3451880Z Running 4 items in this shard: test/distributed/checkpoint/test_compatibility.py::TestDCPCompatbility::test_metadata, test/distributed/checkpoint/test_compatibility.py::TestDCPCompatbility::test_sharded_tensor_dependency, test/distributed/checkpoint/test_compatibility.py::TestDCPCompatbility::test_storage_meta, test/distributed/checkpoint/test_compatibility.py::TestDCPCompatbility::test_with_v_2_3 2025-12-04T11:27:53.3453231Z 2025-12-04T11:27:53.3453556Z Finished distributed/checkpoint/test_compatibility 1/1 ... [2025-12-04 11:27:53.344601][5225714.323637149], took 0.04min 2025-12-04T11:27:53.3459855Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T11:27:53.3470766Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:27:53.3473153Z Running distributed/_tools/test_mem_tracker 1/1 ... [2025-12-04 11:27:53.347231][5225714.326272766] 2025-12-04T11:27:53.3473473Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:27:53.3475615Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_tools/test_mem_tracker.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:27:53.347450] 2025-12-04T11:28:00.7289999Z 2025-12-04T11:28:00.7291431Z distributed/_tools/test_mem_tracker 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._tools.test_mem_tracker_1.1_bc0610b4681d5408_.log 2025-12-04T11:28:00.7292801Z Running 3 items in this shard: test/distributed/_tools/test_mem_tracker.py::TestMemTracker::test_accelerator_tracker_equivalence, test/distributed/_tools/test_mem_tracker.py::TestMemTracker::test_tracker_attribution, test/distributed/_tools/test_mem_tracker.py::TestMemTracker::test_tracker_with_activation_checkpointing 2025-12-04T11:28:00.7293429Z 2025-12-04T11:28:00.7293570Z Finished distributed/_tools/test_mem_tracker 1/1 ... [2025-12-04 11:28:00.728644][5225721.707680062], took 0.12min 2025-12-04T11:28:00.7299213Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T11:28:00.7310440Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:28:00.7312346Z Running distributed/elastic/test_control_plane 1/1 ... [2025-12-04 11:28:00.731159][5225721.71020011] 2025-12-04T11:28:00.7312573Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:28:00.7314474Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/elastic/test_control_plane.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:28:00.731349] 2025-12-04T11:28:03.2997214Z 2025-12-04T11:28:03.2998150Z distributed/elastic/test_control_plane 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.elastic.test_control_plane_1.1_f4cb43a87c9834ba_.log 2025-12-04T11:28:03.3002405Z Running 10 items in this shard: test/distributed/elastic/test_control_plane.py::WorkerServerTest::test_dump_nccl_trace_pickle, test/distributed/elastic/test_control_plane.py::WorkerServerTest::test_dump_nccl_trace_pickle_with_json, test/distributed/elastic/test_control_plane.py::WorkerServerTest::test_dump_nccl_trace_pickle_with_params, test/distributed/elastic/test_control_plane.py::WorkerServerTest::test_dump_traceback, test/distributed/elastic/test_control_plane.py::WorkerServerTest::test_get_handler_names, test/distributed/elastic/test_control_plane.py::WorkerServerTest::test_get_handler_nonexistant, test/distributed/elastic/test_control_plane.py::WorkerServerTest::test_run_handler, test/distributed/elastic/test_control_plane.py::WorkerServerTest::test_tcp, test/distributed/elastic/test_control_plane.py::WorkerServerTest::test_wait_counter_values, test/distributed/elastic/test_control_plane.py::WorkerServerTest::test_worker_server 2025-12-04T11:28:03.3005204Z 2025-12-04T11:28:03.3005498Z Finished distributed/elastic/test_control_plane 1/1 ... [2025-12-04 11:28:03.299403][5225724.278439329], took 0.04min 2025-12-04T11:28:03.3009310Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T11:28:03.3021620Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:28:03.3024261Z Running distributed/fsdp/test_fsdp_overlap 1/1 ... 
[2025-12-04 11:28:03.302331][5225724.281372351] 2025-12-04T11:28:03.3024556Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:28:03.3026747Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_overlap.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:28:03.302540] 2025-12-04T11:29:17.7192518Z 2025-12-04T11:29:17.7193472Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_overlap 1/1 (test/test-reports/distributed.fsdp.test_fsdp_overlap_1.1_576f9da47548da7f_.log) 2025-12-04T11:29:17.7194846Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_overlap/distributed.fsdp.test_fsdp_overlap-c29609b993a3a584.xml 2025-12-04T11:29:17.7195768Z ============================= test session starts ============================== 2025-12-04T11:29:17.7196687Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:29:17.7197003Z cachedir: .pytest_cache 2025-12-04T11:29:17.7197427Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:29:17.7197887Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:29:17.7198100Z configfile: pytest.ini 2025-12-04T11:29:17.7198478Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:29:17.7198875Z collecting ... collected 1 item 2025-12-04T11:29:17.7199103Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T11:29:17.7209413Z Running 1 items in this shard: test/distributed/fsdp/test_fsdp_overlap.py::TestForwardOverlapWorldSizeOneCUDA::test_forward_overlap_cuda 2025-12-04T11:29:17.7209812Z 2025-12-04T11:29:17.7210243Z distributed/fsdp/test_fsdp_overlap.py::TestForwardOverlapWorldSizeOneCUDA::test_forward_overlap_cuda I1204 11:28:05.059000 127756 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 127825 2025-12-04T11:29:17.7211062Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T11:29:17.7211511Z _init_core_state( 2025-12-04T11:29:17.7212291Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T11:29:17.7213249Z _warn_cpu_init() 2025-12-04T11:29:17.7213499Z [rank0]:E1204 11:28:25.353000 127825 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:29:17.7213915Z [rank0]:E1204 11:28:25.353000 127825 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:29:17.7214511Z [rank0]:E1204 11:28:25.353000 127825 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:29:17.7215093Z [rank0]:E1204 11:28:25.353000 127825 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:29:17.7215669Z [rank0]:E1204 11:28:25.353000 127825 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:29:17.7216210Z [rank0]:E1204 11:28:25.353000 127825 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:29:17.7216655Z [rank0]:E1204 11:28:25.353000 127825 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:29:17.7217118Z [rank0]:E1204 11:28:25.353000 127825 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:29:17.7217582Z [rank0]:E1204 11:28:25.353000 127825 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:29:17.7218047Z [rank0]:E1204 11:28:25.353000 127825 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:29:17.7218542Z [rank0]:E1204 11:28:25.353000 127825 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:29:17.7218991Z [rank0]:E1204 11:28:25.353000 127825 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:29:17.7219445Z [rank0]:E1204 11:28:25.353000 127825 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:29:17.7219946Z [rank0]:E1204 11:28:25.353000 127825 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:29:17.7220636Z [rank0]:E1204 11:28:25.353000 127825 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestForwardOverlapWorldSizeOneCUDA.test_forward_overlap_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1633681408 and is now 1669332992. 
2025-12-04T11:29:17.7221270Z [rank0]:E1204 11:28:25.353000 127825 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:29:17.7221620Z [rank0]:E1204 11:28:25.353000 127825 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:29:17.7222217Z [rank0]:E1204 11:28:25.353000 127825 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_overlap.py TestForwardOverlapWorldSizeOneCUDA.test_forward_overlap_cuda 2025-12-04T11:29:17.7222762Z [rank0]:E1204 11:28:25.353000 127825 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:29:17.7223128Z [rank0]:E1204 11:28:25.353000 127825 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:29:17.7223546Z [rank0]:E1204 11:28:25.353000 127825 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:29:17.7223789Z dist init r=0, world=1 2025-12-04T11:29:17.7223857Z 2025-12-04T11:29:17.7223892Z rank0: 2025-12-04T11:29:17.7224101Z e1: {'cpu_iter': 0.0008449301000000631, 'cpu_wait': 1.7167000000206656e-05, 'gpu_compute': 0.017279599979519843, 'gpu_total': 0.3514355033636093} 2025-12-04T11:29:17.7224439Z e2: {'cpu_iter': 0.0019283733999998277, 'cpu_wait': 1.8078999999993072e-05, 'gpu_compute': 0.03761139996349812, 'gpu_total': 0.7945138156414032} 2025-12-04T11:29:17.7224765Z e3: {'cpu_iter': 0.0015969616999997882, 'cpu_wait': 0.39530624519999974, 'gpu_compute': 397.0044761657715, 'gpu_total': 397.34800109863284} 2025-12-04T11:29:17.7225078Z e4: {'cpu_iter': 0.0036101315999992776, 'cpu_wait': 0.7498134582999999, 'gpu_compute': 397.0139923095703, 'gpu_total': 397.546044921875} 2025-12-04T11:29:17.7225613Z [rank0]:[W1204 11:28:25.605091950 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:29:17.7226032Z FAILED [21.7298s] [100%] 2025-12-04T11:29:17.7226102Z 2025-12-04T11:29:17.7226161Z =================================== FAILURES =================================== 2025-12-04T11:29:17.7226363Z _________ TestForwardOverlapWorldSizeOneCUDA.test_forward_overlap_cuda _________ 2025-12-04T11:29:17.7226552Z Traceback (most recent call last): 2025-12-04T11:29:17.7226807Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:29:17.7227055Z self._join_processes(fn) 2025-12-04T11:29:17.7227303Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:29:17.7227604Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:29:17.7227874Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:29:17.7228133Z raise RuntimeError(error) 2025-12-04T11:29:17.7228289Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:29:17.7228452Z Traceback (most recent call last): 2025-12-04T11:29:17.7228693Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:29:17.7228937Z getattr(self, test_name)() 2025-12-04T11:29:17.7229170Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:29:17.7229403Z fn() 2025-12-04T11:29:17.7229660Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:29:17.7229897Z method(*args, **kwargs) 2025-12-04T11:29:17.7230120Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:29:17.7230351Z method(*args, **kwargs) 2025-12-04T11:29:17.7230569Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:29:17.7230797Z with policy(): 2025-12-04T11:29:17.7231010Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:29:17.7231278Z raise RuntimeError(msg) 2025-12-04T11:29:17.7231692Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestForwardOverlapWorldSizeOneCUDA.test_forward_overlap_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1633681408 and is now 1669332992. 2025-12-04T11:29:17.7232067Z 2025-12-04T11:29:17.7232144Z To execute this test, run the following from the base repo dir: 2025-12-04T11:29:17.7232488Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_overlap.py TestForwardOverlapWorldSizeOneCUDA.test_forward_overlap_cuda 2025-12-04T11:29:17.7232752Z 2025-12-04T11:29:17.7232846Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:29:17.7232973Z 2025-12-04T11:29:17.7232974Z 2025-12-04T11:29:17.7233057Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:29:17.7233263Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:29:17.7233632Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_overlap/distributed.fsdp.test_fsdp_overlap-c29609b993a3a584.xml - 2025-12-04T11:29:17.7233974Z =========================== short test summary info ============================ 2025-12-04T11:29:17.7234327Z FAILED [21.7298s] distributed/fsdp/test_fsdp_overlap.py::TestForwardOverlapWorldSizeOneCUDA::test_forward_overlap_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:29:17.7234656Z Traceback (most recent call last): 2025-12-04T11:29:17.7234902Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:29:17.7235148Z getattr(self, test_name)() 2025-12-04T11:29:17.7235384Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:29:17.7235624Z fn() 2025-12-04T11:29:17.7235826Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:29:17.7236059Z method(*args, **kwargs) 2025-12-04T11:29:17.7236280Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:29:17.7236561Z method(*args, **kwargs) 2025-12-04T11:29:17.7236779Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:29:17.7237007Z with policy(): 2025-12-04T11:29:17.7237219Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:29:17.7237450Z raise RuntimeError(msg) 2025-12-04T11:29:17.7237867Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestForwardOverlapWorldSizeOneCUDA.test_forward_overlap_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1633681408 and is now 1669332992. 2025-12-04T11:29:17.7238243Z 2025-12-04T11:29:17.7238320Z To execute this test, run the following from the base repo dir: 2025-12-04T11:29:17.7238658Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_overlap.py TestForwardOverlapWorldSizeOneCUDA.test_forward_overlap_cuda 2025-12-04T11:29:17.7238920Z 2025-12-04T11:29:17.7239008Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:29:17.7239198Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:29:17.7239359Z ============================== 1 failed in 21.89s ============================== 2025-12-04T11:29:17.7239493Z Got exit code 1 2025-12-04T11:29:17.7239662Z Retrying single test... 
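The failure above was raised by the CUDA memory-leak checker enabled for this shard (the job config includes mem_leak_check, and the repro line sets PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1): allocation is sampled before and after the test, and the test fails if memory grew. The snippet below is only a loose sketch of the caching-allocator half of that comparison; the real check (the common_utils.py __exit__ frames in the traceback) also consults CUDA driver-level allocation.

# Loose illustrative sketch of a before/after caching-allocator comparison in the
# spirit of the leak check that failed above. The real checker in
# torch/testing/_internal/common_utils.py also tracks CUDA driver allocation.
import gc
from contextlib import contextmanager

import torch

@contextmanager
def assert_no_cuda_leak(device=0):
    torch.cuda.synchronize(device)
    gc.collect()
    before = torch.cuda.memory_allocated(device)   # caching-allocator bytes before the test
    yield
    torch.cuda.synchronize(device)
    gc.collect()
    after = torch.cuda.memory_allocated(device)    # and after it
    if after > before:
        raise RuntimeError(
            f"Possible CUDA memory leak: caching allocator allocated memory was "
            f"{before} and is now reported as {after} on device {device}."
        )

# Usage sketch: wrap the body of a single test.
# with assert_no_cuda_leak(device=0):
#     run_the_test()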
2025-12-04T11:29:17.7239931Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_overlap/distributed.fsdp.test_fsdp_overlap-7ccc9be6dff28e59.xml 2025-12-04T11:29:17.7240222Z ============================= test session starts ============================== 2025-12-04T11:29:17.7240436Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:29:17.7240624Z cachedir: .pytest_cache 2025-12-04T11:29:17.7240849Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:29:17.7241086Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:29:17.7241204Z configfile: pytest.ini 2025-12-04T11:29:17.7241432Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:29:17.7241674Z collecting ... collected 1 item 2025-12-04T11:29:17.7241967Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_overlap.py::TestForwardOverlapWorldSizeOneCUDA::test_forward_overlap_cuda 2025-12-04T11:29:17.7242261Z Running 1 items in this shard 2025-12-04T11:29:17.7242332Z 2025-12-04T11:29:17.7242642Z distributed/fsdp/test_fsdp_overlap.py::TestForwardOverlapWorldSizeOneCUDA::test_forward_overlap_cuda I1204 11:28:29.156000 127908 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 127977 2025-12-04T11:29:17.7243282Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T11:29:17.7243647Z _init_core_state( 2025-12-04T11:29:17.7244292Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T11:29:17.7244939Z _warn_cpu_init() 2025-12-04T11:29:17.7245279Z [rank0]:E1204 11:28:49.438000 127977 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:29:17.7245665Z [rank0]:E1204 11:28:49.438000 127977 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:29:17.7246152Z [rank0]:E1204 11:28:49.438000 127977 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:29:17.7246632Z [rank0]:E1204 11:28:49.438000 127977 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:29:17.7247108Z [rank0]:E1204 11:28:49.438000 127977 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:29:17.7247554Z [rank0]:E1204 11:28:49.438000 127977 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:29:17.7248049Z [rank0]:E1204 11:28:49.438000 127977 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:29:17.7248512Z [rank0]:E1204 11:28:49.438000 127977 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:29:17.7248973Z [rank0]:E1204 11:28:49.438000 127977 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:29:17.7249469Z [rank0]:E1204 11:28:49.438000 127977 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:29:17.7249973Z [rank0]:E1204 11:28:49.438000 127977 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:29:17.7250427Z [rank0]:E1204 11:28:49.438000 127977 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:29:17.7250880Z [rank0]:E1204 11:28:49.438000 127977 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:29:17.7251344Z [rank0]:E1204 11:28:49.438000 127977 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:29:17.7252008Z [rank0]:E1204 11:28:49.438000 127977 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestForwardOverlapWorldSizeOneCUDA.test_forward_overlap_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1633681408 and is now 1669332992. 
2025-12-04T11:29:17.7252630Z [rank0]:E1204 11:28:49.438000 127977 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:29:17.7252976Z [rank0]:E1204 11:28:49.438000 127977 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:29:17.7253561Z [rank0]:E1204 11:28:49.438000 127977 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_overlap.py TestForwardOverlapWorldSizeOneCUDA.test_forward_overlap_cuda 2025-12-04T11:29:17.7254062Z [rank0]:E1204 11:28:49.438000 127977 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:29:17.7254421Z [rank0]:E1204 11:28:49.438000 127977 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:29:17.7254863Z [rank0]:E1204 11:28:49.438000 127977 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:29:17.7255102Z dist init r=0, world=1 2025-12-04T11:29:17.7255164Z 2025-12-04T11:29:17.7255201Z rank0: 2025-12-04T11:29:17.7255405Z e1: {'cpu_iter': 0.0008104815999999459, 'cpu_wait': 1.8289099999790892e-05, 'gpu_compute': 0.01727570001967251, 'gpu_total': 0.33282740116119386} 2025-12-04T11:29:17.7255736Z e2: {'cpu_iter': 0.0019234882000001008, 'cpu_wait': 1.888799999996138e-05, 'gpu_compute': 0.039115499798208477, 'gpu_total': 0.7730338037014007} 2025-12-04T11:29:17.7256057Z e3: {'cpu_iter': 0.001446042500000111, 'cpu_wait': 0.39556908509999966, 'gpu_compute': 397.13989410400393, 'gpu_total': 397.4113311767578} 2025-12-04T11:29:17.7256370Z e4: {'cpu_iter': 0.0032382339999998066, 'cpu_wait': 0.7502928507999993, 'gpu_compute': 397.148299407959, 'gpu_total': 397.53850708007815} 2025-12-04T11:29:17.7256888Z [rank0]:[W1204 11:28:49.605995528 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:29:17.7257298Z FAILED [21.6340s] [100%] 2025-12-04T11:29:17.7257364Z 2025-12-04T11:29:17.7257422Z =================================== FAILURES =================================== 2025-12-04T11:29:17.7257618Z _________ TestForwardOverlapWorldSizeOneCUDA.test_forward_overlap_cuda _________ 2025-12-04T11:29:17.7257799Z Traceback (most recent call last): 2025-12-04T11:29:17.7258079Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:29:17.7258322Z self._join_processes(fn) 2025-12-04T11:29:17.7258566Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:29:17.7258828Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:29:17.7259095Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:29:17.7259353Z raise RuntimeError(error) 2025-12-04T11:29:17.7259502Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:29:17.7259705Z Traceback (most recent call last): 2025-12-04T11:29:17.7259944Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:29:17.7260197Z getattr(self, test_name)() 2025-12-04T11:29:17.7260428Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:29:17.7260660Z fn() 2025-12-04T11:29:17.7260863Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:29:17.7261093Z method(*args, **kwargs) 2025-12-04T11:29:17.7261315Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:29:17.7261544Z method(*args, **kwargs) 2025-12-04T11:29:17.7261761Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:29:17.7261990Z with policy(): 2025-12-04T11:29:17.7262202Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:29:17.7262435Z raise RuntimeError(msg) 2025-12-04T11:29:17.7262847Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestForwardOverlapWorldSizeOneCUDA.test_forward_overlap_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1633681408 and is now 1669332992. 2025-12-04T11:29:17.7263224Z 2025-12-04T11:29:17.7263298Z To execute this test, run the following from the base repo dir: 2025-12-04T11:29:17.7263688Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_overlap.py TestForwardOverlapWorldSizeOneCUDA.test_forward_overlap_cuda 2025-12-04T11:29:17.7263947Z 2025-12-04T11:29:17.7264032Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:29:17.7264158Z 2025-12-04T11:29:17.7264160Z 2025-12-04T11:29:17.7264235Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:29:17.7264435Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:29:17.7264804Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_overlap/distributed.fsdp.test_fsdp_overlap-7ccc9be6dff28e59.xml - 2025-12-04T11:29:17.7265145Z =========================== short test summary info ============================ 2025-12-04T11:29:17.7265491Z FAILED [21.6340s] distributed/fsdp/test_fsdp_overlap.py::TestForwardOverlapWorldSizeOneCUDA::test_forward_overlap_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:29:17.7265814Z Traceback (most recent call last): 2025-12-04T11:29:17.7266057Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:29:17.7266297Z getattr(self, test_name)() 2025-12-04T11:29:17.7266529Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:29:17.7266793Z fn() 2025-12-04T11:29:17.7266993Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:29:17.7267221Z method(*args, **kwargs) 2025-12-04T11:29:17.7267438Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:29:17.7267664Z method(*args, **kwargs) 2025-12-04T11:29:17.7267881Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:29:17.7268104Z with policy(): 2025-12-04T11:29:17.7268315Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:29:17.7268561Z raise RuntimeError(msg) 2025-12-04T11:29:17.7268968Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestForwardOverlapWorldSizeOneCUDA.test_forward_overlap_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1633681408 and is now 1669332992. 2025-12-04T11:29:17.7269352Z 2025-12-04T11:29:17.7269425Z To execute this test, run the following from the base repo dir: 2025-12-04T11:29:17.7269796Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_overlap.py TestForwardOverlapWorldSizeOneCUDA.test_forward_overlap_cuda 2025-12-04T11:29:17.7270059Z 2025-12-04T11:29:17.7270148Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:29:17.7270334Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:29:17.7270489Z ============================== 1 failed in 21.79s ============================== 2025-12-04T11:29:17.7270618Z Got exit code 1 2025-12-04T11:29:17.7270713Z Retrying single test... 
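The two UserWarnings in the retry output above point at the same cleanup theme as the leak check: FSDP recommends passing `device_id` when the wrapped module starts on CPU, and ProcessGroupNCCL warns that `destroy_process_group()` was never called before exit. A minimal, hypothetical world-size-1 setup that follows both recommendations could look like the sketch below; the rendezvous address, port, and toy model are placeholders and are not taken from the test.

```python
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# Placeholder single-process rendezvous; the real test uses its own multiprocess harness.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group("nccl", rank=0, world_size=1)  # RCCL backs the "nccl" backend on ROCm

model = nn.Linear(1024, 1024)                                 # built on CPU, as in the warning
fsdp_model = FSDP(model, device_id=torch.device("cuda", 0))   # lets FSDP move it to GPU for sharding init

# ... forward/backward passes would go here ...

dist.destroy_process_group()  # explicit shutdown, avoiding the ProcessGroupNCCL warning seen above
```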
2025-12-04T11:29:17.7270975Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_overlap/distributed.fsdp.test_fsdp_overlap-7a5f0417d5f82ba7.xml 2025-12-04T11:29:17.7271268Z ============================= test session starts ============================== 2025-12-04T11:29:17.7271478Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:29:17.7271666Z cachedir: .pytest_cache 2025-12-04T11:29:17.7271929Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:29:17.7272167Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:29:17.7272283Z configfile: pytest.ini 2025-12-04T11:29:17.7272512Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:29:17.7272752Z collecting ... collected 1 item 2025-12-04T11:29:17.7273041Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_overlap.py::TestForwardOverlapWorldSizeOneCUDA::test_forward_overlap_cuda 2025-12-04T11:29:17.7273338Z Running 1 items in this shard 2025-12-04T11:29:17.7273408Z 2025-12-04T11:29:17.7273715Z distributed/fsdp/test_fsdp_overlap.py::TestForwardOverlapWorldSizeOneCUDA::test_forward_overlap_cuda I1204 11:28:53.304000 128060 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 128129 2025-12-04T11:29:17.7274351Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T11:29:17.7274718Z _init_core_state( 2025-12-04T11:29:17.7275350Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T11:29:17.7276023Z _warn_cpu_init() 2025-12-04T11:29:17.7276224Z [rank0]:E1204 11:29:13.576000 128129 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:29:17.7276566Z [rank0]:E1204 11:29:13.576000 128129 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:29:17.7277051Z [rank0]:E1204 11:29:13.576000 128129 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:29:17.7277528Z [rank0]:E1204 11:29:13.576000 128129 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:29:17.7278004Z [rank0]:E1204 11:29:13.576000 128129 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:29:17.7278451Z [rank0]:E1204 11:29:13.576000 128129 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:29:17.7278892Z [rank0]:E1204 11:29:13.576000 128129 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:29:17.7279354Z [rank0]:E1204 11:29:13.576000 128129 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:29:17.7279850Z [rank0]:E1204 11:29:13.576000 128129 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:29:17.7280312Z [rank0]:E1204 11:29:13.576000 128129 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:29:17.7280773Z [rank0]:E1204 11:29:13.576000 128129 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:29:17.7281258Z [rank0]:E1204 11:29:13.576000 128129 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:29:17.7281717Z [rank0]:E1204 11:29:13.576000 128129 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:29:17.7282178Z [rank0]:E1204 11:29:13.576000 128129 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:29:17.7282834Z [rank0]:E1204 11:29:13.576000 128129 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestForwardOverlapWorldSizeOneCUDA.test_forward_overlap_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1633681408 and is now 1669332992. 
2025-12-04T11:29:17.7283450Z [rank0]:E1204 11:29:13.576000 128129 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:29:17.7283799Z [rank0]:E1204 11:29:13.576000 128129 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:29:17.7284382Z [rank0]:E1204 11:29:13.576000 128129 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_overlap.py TestForwardOverlapWorldSizeOneCUDA.test_forward_overlap_cuda 2025-12-04T11:29:17.7284911Z [rank0]:E1204 11:29:13.576000 128129 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:29:17.7285271Z [rank0]:E1204 11:29:13.576000 128129 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:29:17.7285687Z [rank0]:E1204 11:29:13.576000 128129 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:29:17.7285931Z dist init r=0, world=1 2025-12-04T11:29:17.7285994Z 2025-12-04T11:29:17.7286030Z rank0: 2025-12-04T11:29:17.7286233Z e1: {'cpu_iter': 0.0007750598999999525, 'cpu_wait': 2.6316000000115024e-05, 'gpu_compute': 0.017363999970257282, 'gpu_total': 0.3225758999586105} 2025-12-04T11:29:17.7286564Z e2: {'cpu_iter': 0.0018774933000003102, 'cpu_wait': 1.899099999977949e-05, 'gpu_compute': 0.037371299928054214, 'gpu_total': 0.7534022033214569} 2025-12-04T11:29:17.7286880Z e3: {'cpu_iter': 0.0013932615000001648, 'cpu_wait': 0.3960319146999998, 'gpu_compute': 397.6953666687012, 'gpu_total': 397.93882446289064} 2025-12-04T11:29:17.7287193Z e4: {'cpu_iter': 0.0031471187000001065, 'cpu_wait': 0.7516728245000003, 'gpu_compute': 397.7338554382324, 'gpu_total': 398.10332946777345} 2025-12-04T11:29:17.7287716Z [rank0]:[W1204 11:29:13.757007657 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:29:17.7288123Z FAILED [21.6327s] [100%] 2025-12-04T11:29:17.7288187Z 2025-12-04T11:29:17.7288244Z =================================== FAILURES =================================== 2025-12-04T11:29:17.7288437Z _________ TestForwardOverlapWorldSizeOneCUDA.test_forward_overlap_cuda _________ 2025-12-04T11:29:17.7288619Z Traceback (most recent call last): 2025-12-04T11:29:17.7288864Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:29:17.7289111Z self._join_processes(fn) 2025-12-04T11:29:17.7289357Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:29:17.7289653Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:29:17.7289957Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:29:17.7290215Z raise RuntimeError(error) 2025-12-04T11:29:17.7290363Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:29:17.7290523Z Traceback (most recent call last): 2025-12-04T11:29:17.7290760Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:29:17.7291002Z getattr(self, test_name)() 2025-12-04T11:29:17.7291236Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:29:17.7291466Z fn() 2025-12-04T11:29:17.7291666Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:29:17.7291895Z method(*args, **kwargs) 2025-12-04T11:29:17.7292118Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:29:17.7292345Z method(*args, **kwargs) 2025-12-04T11:29:17.7292561Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:29:17.7292786Z with policy(): 2025-12-04T11:29:17.7292995Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:29:17.7293224Z raise RuntimeError(msg) 2025-12-04T11:29:17.7293666Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestForwardOverlapWorldSizeOneCUDA.test_forward_overlap_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1633681408 and is now 1669332992. 2025-12-04T11:29:17.7294041Z 2025-12-04T11:29:17.7294117Z To execute this test, run the following from the base repo dir: 2025-12-04T11:29:17.7294451Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_overlap.py TestForwardOverlapWorldSizeOneCUDA.test_forward_overlap_cuda 2025-12-04T11:29:17.7294711Z 2025-12-04T11:29:17.7294798Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:29:17.7294922Z 2025-12-04T11:29:17.7294924Z 2025-12-04T11:29:17.7294998Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:29:17.7295194Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:29:17.7295563Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_overlap/distributed.fsdp.test_fsdp_overlap-7a5f0417d5f82ba7.xml - 2025-12-04T11:29:17.7295900Z =========================== short test summary info ============================ 2025-12-04T11:29:17.7296247Z FAILED [21.6327s] distributed/fsdp/test_fsdp_overlap.py::TestForwardOverlapWorldSizeOneCUDA::test_forward_overlap_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:29:17.7296570Z Traceback (most recent call last): 2025-12-04T11:29:17.7296812Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:29:17.7297054Z getattr(self, test_name)() 2025-12-04T11:29:17.7297284Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:29:17.7297513Z fn() 2025-12-04T11:29:17.7297713Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:29:17.7297940Z method(*args, **kwargs) 2025-12-04T11:29:17.7298157Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:29:17.7298399Z method(*args, **kwargs) 2025-12-04T11:29:17.7298643Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:29:17.7298868Z with policy(): 2025-12-04T11:29:17.7299078Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:29:17.7299308Z raise RuntimeError(msg) 2025-12-04T11:29:17.7299756Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestForwardOverlapWorldSizeOneCUDA.test_forward_overlap_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1633681408 and is now 1669332992. 2025-12-04T11:29:17.7300134Z 2025-12-04T11:29:17.7300210Z To execute this test, run the following from the base repo dir: 2025-12-04T11:29:17.7300541Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_overlap.py TestForwardOverlapWorldSizeOneCUDA.test_forward_overlap_cuda 2025-12-04T11:29:17.7300797Z 2025-12-04T11:29:17.7300886Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:29:17.7301072Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T11:29:17.7301226Z ============================== 1 failed in 21.79s ============================== 2025-12-04T11:29:17.7301355Z Got exit code 1 2025-12-04T11:29:17.7301585Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_overlap.py::TestForwardOverlapWorldSizeOneCUDA::test_forward_overlap_cuda 2025-12-04T11:29:17.7301920Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T11:29:17.7302327Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_overlap/distributed.fsdp.test_fsdp_overlap-ee596d64f21d92a2.xml 2025-12-04T11:29:17.7302621Z ============================= test session starts ============================== 2025-12-04T11:29:17.7302832Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:29:17.7303023Z cachedir: .pytest_cache 2025-12-04T11:29:17.7303246Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:29:17.7303482Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:29:17.7303597Z configfile: pytest.ini 2025-12-04T11:29:17.7303823Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:29:17.7304094Z collecting ... collected 1 item / 1 deselected / 0 selected 2025-12-04T11:29:17.7304250Z stepcurrent: skipping 1 already run items. 2025-12-04T11:29:17.7304378Z Running 0 items in this shard 2025-12-04T11:29:17.7304450Z 2025-12-04T11:29:17.7304694Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_overlap/distributed.fsdp.test_fsdp_overlap-ee596d64f21d92a2.xml - 2025-12-04T11:29:17.7305033Z ============================ 1 deselected in 0.00s ============================= 2025-12-04T11:29:17.7305334Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_overlap.py::TestForwardOverlapWorldSizeOneCUDA::test_forward_overlap_cuda'] 2025-12-04T11:29:17.7305572Z 2025-12-04T11:29:17.7305766Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_overlap 1/1 (test/test-reports/distributed.fsdp.test_fsdp_overlap_1.1_576f9da47548da7f_.log) 2025-12-04T11:29:17.7305994Z 2025-12-04T11:29:17.7306121Z Finished distributed/fsdp/test_fsdp_overlap 1/1 ... [2025-12-04 11:29:17.719179][5225798.698215014], took 1.24min 2025-12-04T11:29:17.7306549Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T11:29:17.7306938Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:29:17.7307153Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T11:29:17.7307363Z Uploading artifacts took 0.00 seconds 2025-12-04T11:29:17.7307500Z distributed/fsdp/test_fsdp_overlap 1/1 failed! 2025-12-04T11:29:17.7307695Z Running distributed/test_functional_api 1/1 ... [2025-12-04 11:29:17.721807][5225798.7008489] 2025-12-04T11:29:17.7307885Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:29:17.7308281Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_functional_api.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:29:17.722015] 2025-12-04T11:31:06.3428328Z 2025-12-04T11:31:06.3431754Z distributed/test_functional_api 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_functional_api_1.1_4cf011bcbb1b8894_.log 2025-12-04T11:31:06.3435804Z Running 11 items in this shard: test/distributed/test_functional_api.py::TestMetaCollectives::test_all_reduce, test/distributed/test_functional_api.py::TestMakeFx::test_all_reduce_tracing, test/distributed/test_functional_api.py::TestCollectivesWithDistributedBackendCUDA::test_all_gather_into_tensor_coalesced_cuda, test/distributed/test_functional_api.py::TestCollectivesWithDistributedBackendCUDA::test_all_to_all_single_1d_input_cuda, test/distributed/test_functional_api.py::TestCollectivesWithDistributedBackendCUDA::test_all_to_all_single_cuda, test/distributed/test_functional_api.py::TestCollectivesWithDistributedBackendCUDA::test_all_to_all_single_split_sizes_none_cuda, test/distributed/test_functional_api.py::TestCollectivesWithDistributedBackendCUDA::test_tracing_cuda, test/distributed/test_functional_api.py::TestCollectivesWithDistributedBackendCUDA::test_tracing_with_dce_code_cuda, test/distributed/test_functional_api.py::TestCollectivesWithDistributedBackendCUDA::test_tracing_with_fakepg_cuda, test/distributed/test_functional_api.py::TestDistributedBackendCollectivesWithWorldSize4CUDA::test_permute_tensor_with_sub_group_cuda, test/distributed/test_functional_api.py::TestFunctionalAutogradWithDistributedBackendCUDA::test_all_to_all_single_cuda 2025-12-04T11:31:06.3440221Z 2025-12-04T11:31:06.3440463Z Finished distributed/test_functional_api 1/1 ... [2025-12-04 11:31:06.342511][5225907.321548056], took 1.81min 2025-12-04T11:31:06.3441281Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T11:31:06.3454977Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:31:06.3455647Z Running distributed/_composable/test_composability/test_2d_composability 1/1 ... [2025-12-04 11:31:06.345430][5225907.324471418] 2025-12-04T11:31:06.3455975Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:31:06.3457999Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_composable/test_composability/test_2d_composability.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:31:06.345620] 2025-12-04T11:33:37.8860707Z 2025-12-04T11:33:37.8861819Z distributed/_composable/test_composability/test_2d_composability 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.test_composability.test_2d_composability_1.1_198d128d8c5f883d_.log 2025-12-04T11:33:37.8870943Z Running 18 items in this shard: test/distributed/_composable/test_composability/test_2d_composability.py::TestFullyShard2DTraining::test_tp_with_fsdp_offloading, test/distributed/_composable/test_composability/test_2d_composability.py::TestFullyShard2DTraining::test_train_parity_2d_mlp, test/distributed/_composable/test_composability/test_2d_composability.py::TestFullyShard2DTraining::test_train_parity_2d_transformer, test/distributed/_composable/test_composability/test_2d_composability.py::TestFullyShard2DTraining::test_train_parity_2d_transformer_checkpoint_resume, test/distributed/_composable/test_composability/test_2d_composability.py::TestFullyShard2DStateDict::test_fully_shard_tp_2d_set_full_state_dict, test/distributed/_composable/test_composability/test_2d_composability.py::Test2dFSDP1ParallelIntegration::test_2d_ddp_integration_functionality, test/distributed/_composable/test_composability/test_2d_composability.py::TestNew2dParallelTraining::test_2d_e2e_training_default, test/distributed/_composable/test_composability/test_2d_composability.py::TestNew2dParallelTraining::test_2d_e2e_training_not_use_orig_params, test/distributed/_composable/test_composability/test_2d_composability.py::TestNew2dParallelTraining::test_2d_e2e_training_use_orig_params, test/distributed/_composable/test_composability/test_2d_composability.py::TestNew2dParallelTraining::test_2d_fsdp_state_enable_extension, test/distributed/_composable/test_composability/test_2d_composability.py::TestNew2dParallelStateDict::test_2d_load_state_dict_is_even_sharded_model_False, test/distributed/_composable/test_composability/test_2d_composability.py::TestNew2dParallelStateDict::test_2d_load_state_dict_is_even_sharded_model_True, test/distributed/_composable/test_composability/test_2d_composability.py::TestNew2dParallelStateDict::test_2d_optim_state_dict_is_even_sharded_model_False, test/distributed/_composable/test_composability/test_2d_composability.py::TestNew2dParallelStateDict::test_2d_optim_state_dict_is_even_sharded_model_True, test/distributed/_composable/test_composability/test_2d_composability.py::TestNew2dParallelStateDict::test_2d_state_dict_is_even_sharded_model_False, test/distributed/_composable/test_composability/test_2d_composability.py::TestNew2dParallelStateDict::test_2d_state_dict_is_even_sharded_model_True, test/distributed/_composable/test_composability/test_2d_composability.py::TestNew2dParallelStateDict::test_fsdp1_tp_2d_set_full_state_dict, test/distributed/_composable/test_composability/test_2d_composability.py::TestNew2dParallelStateDict::test_fsdp_2d_extension 2025-12-04T11:33:37.8877658Z 2025-12-04T11:33:37.8877936Z Finished distributed/_composable/test_composability/test_2d_composability 1/1 ... 
[2025-12-04 11:33:37.885864][5226058.864901258], took 2.53min 2025-12-04T11:33:37.8878585Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T11:33:37.8884455Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:33:37.8886916Z Running distributed/fsdp/test_fsdp_optim_state 1/1 ... [2025-12-04 11:33:37.888592][5226058.867633463] 2025-12-04T11:33:37.8887198Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:33:37.8889096Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_optim_state.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:33:37.888791] 2025-12-04T11:41:42.2205552Z 2025-12-04T11:41:42.2206765Z distributed/fsdp/test_fsdp_optim_state 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_optim_state_1.1_18ae9281748a32e9_.log 2025-12-04T11:41:42.2232274Z Running 60 items in this shard: test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_compatible_with_trec, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_flatten_sharded_optim_state_dict_nested, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_flatten_sharded_optim_state_dict_transformer, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_full_optim_state_dict_keys, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_full_optim_state_dict_nested_invalid, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_interface_arguments, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_no_grad, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_input_warning, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type0_use_multiple_param_groups_False_rank0_only_False_use_diff_optim_inputs_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type0_use_multiple_param_groups_False_rank0_only_False_use_diff_optim_inputs_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type0_use_multiple_param_groups_False_rank0_only_True_use_diff_optim_inputs_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type0_use_multiple_param_groups_False_rank0_only_True_use_diff_optim_inputs_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type0_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type0_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type0_use_multiple_param_groups_True_rank0_only_True_use_diff_optim_inputs_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type0_use_multiple_param_groups_True_rank0_only_True_use_diff_optim_inputs_True, 
test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type1_use_multiple_param_groups_False_rank0_only_False_use_diff_optim_inputs_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type1_use_multiple_param_groups_False_rank0_only_False_use_diff_optim_inputs_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type1_use_multiple_param_groups_False_rank0_only_True_use_diff_optim_inputs_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type1_use_multiple_param_groups_False_rank0_only_True_use_diff_optim_inputs_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type1_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type1_use_multiple_param_groups_True_rank0_only_False_use_diff_optim_inputs_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type1_use_multiple_param_groups_True_rank0_only_True_use_diff_optim_inputs_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_dict_nested_state_dict_type1_use_multiple_param_groups_True_rank0_only_True_use_diff_optim_inputs_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_optim_state_without_param_groups, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_rekey_optim_state_dict_to_ids_state_dict_type0_use_multiple_param_groups_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_rekey_optim_state_dict_to_ids_state_dict_type0_use_multiple_param_groups_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_rekey_optim_state_dict_to_ids_state_dict_type1_use_multiple_param_groups_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_rekey_optim_state_dict_to_ids_state_dict_type1_use_multiple_param_groups_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_rekey_optim_state_dict_to_names, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_save_load_without_0th_param_state_state_dict_type0, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_save_load_without_0th_param_state_state_dict_type1, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_scatter_full_optim_state_dict_nested_halve_world_size, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_False_use_diff_optim_inputs_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_False_use_diff_optim_inputs_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_True_use_diff_optim_inputs_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_True_use_diff_optim_inputs_True, 
test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_False_use_diff_optim_inputs_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_False_use_diff_optim_inputs_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_True_use_diff_optim_inputs_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_scatter_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_True_use_diff_optim_inputs_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_scatter_full_optim_state_dict_transformer, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_shard_full_optim_state_dict_nested_halve_world_size, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_shard_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_False_use_diff_optim_inputs_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_shard_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_False_use_diff_optim_inputs_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_shard_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_True_use_diff_optim_inputs_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_shard_full_optim_state_dict_nested_use_multiple_param_groups_False_wrap_alt_True_use_diff_optim_inputs_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_shard_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_False_use_diff_optim_inputs_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_shard_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_False_use_diff_optim_inputs_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_shard_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_True_use_diff_optim_inputs_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_shard_full_optim_state_dict_nested_use_multiple_param_groups_True_wrap_alt_True_use_diff_optim_inputs_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_shard_full_optim_state_dict_transformer, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_shard_full_optim_state_dict_unmanaged_params_state_dict_type0_add_to_fsdp_module_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_shard_full_optim_state_dict_unmanaged_params_state_dict_type0_add_to_fsdp_module_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_shard_full_optim_state_dict_unmanaged_params_state_dict_type1_add_to_fsdp_module_False, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_shard_full_optim_state_dict_unmanaged_params_state_dict_type1_add_to_fsdp_module_True, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_state_dict_with_none_tensor_state, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_use_orig_params, test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_with_empty_optimizer_state, 
test/distributed/fsdp/test_fsdp_optim_state.py::TestFSDPOptimState::test_with_no_shard 2025-12-04T11:41:42.2245828Z 2025-12-04T11:41:42.2245971Z Finished distributed/fsdp/test_fsdp_optim_state 1/1 ... [2025-12-04 11:41:42.221991][5226543.201028217], took 8.07min 2025-12-04T11:41:42.2246422Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T11:41:42.2246817Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:41:42.2247060Z Running distributed/tensor/test_view_ops 1/1 ... [2025-12-04 11:41:42.224203][5226543.203244639] 2025-12-04T11:41:42.2247262Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:41:42.2247671Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/tensor/test_view_ops.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:41:42.224416] 2025-12-04T11:47:01.3573656Z 2025-12-04T11:47:01.3574950Z distributed/tensor/test_view_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_view_ops_1.1_fda6ad413b66c16d_.log 2025-12-04T11:47:01.3580852Z Running 20 items in this shard: test/distributed/tensor/test_view_ops.py::TestViewOps::test_complex_view_ops, test/distributed/tensor/test_view_ops.py::TestViewOps::test_dtensor_view_op_uneven, test/distributed/tensor/test_view_ops.py::TestViewOps::test_illegal_views, test/distributed/tensor/test_view_ops.py::TestViewOps::test_squeeze_, test/distributed/tensor/test_view_ops.py::TestViewOps::test_storage_offset_shard_dim0_slice_dim1, test/distributed/tensor/test_view_ops.py::TestViewOps::test_storage_offset_shard_dim1_slice_dim0, test/distributed/tensor/test_view_ops.py::TestViewOps::test_storage_offset_slice, test/distributed/tensor/test_view_ops.py::TestViewOps::test_view_groups, test/distributed/tensor/test_view_ops.py::TestViewOps::test_view_ops, test/distributed/tensor/test_view_ops.py::TestViewOps::test_view_redistribution, test/distributed/tensor/test_view_ops.py::TestViewOpsWithLocalTensor::test_complex_view_ops, test/distributed/tensor/test_view_ops.py::TestViewOpsWithLocalTensor::test_dtensor_view_op_uneven, test/distributed/tensor/test_view_ops.py::TestViewOpsWithLocalTensor::test_illegal_views, test/distributed/tensor/test_view_ops.py::TestViewOpsWithLocalTensor::test_squeeze_, test/distributed/tensor/test_view_ops.py::TestViewOpsWithLocalTensor::test_storage_offset_shard_dim0_slice_dim1, test/distributed/tensor/test_view_ops.py::TestViewOpsWithLocalTensor::test_storage_offset_shard_dim1_slice_dim0, test/distributed/tensor/test_view_ops.py::TestViewOpsWithLocalTensor::test_storage_offset_slice, test/distributed/tensor/test_view_ops.py::TestViewOpsWithLocalTensor::test_view_groups, test/distributed/tensor/test_view_ops.py::TestViewOpsWithLocalTensor::test_view_ops, test/distributed/tensor/test_view_ops.py::TestViewOpsWithLocalTensor::test_view_redistribution 2025-12-04T11:47:01.3583484Z 2025-12-04T11:47:01.3583621Z Finished distributed/tensor/test_view_ops 1/1 ... 
[2025-12-04 11:47:01.357131][5226862.336169338], took 5.32min 2025-12-04T11:47:01.3584108Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T11:47:01.3593848Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:47:01.3596543Z Running distributed/fsdp/test_fsdp_state_dict 2/2 ... [2025-12-04 11:47:01.359513][5226862.338554008] 2025-12-04T11:47:01.3596765Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:47:01.3598135Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_state_dict.py', '--shard-id=2', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:47:01.359693] 2025-12-04T11:55:04.3034348Z 2025-12-04T11:55:04.3034931Z distributed/fsdp/test_fsdp_state_dict 2/2 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_state_dict_2.2_1e2e4278ab69a39a_.log 2025-12-04T11:55:04.3061746Z Running 101 items in this shard: test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload0_fp16_False_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload0_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload0_fp16_True_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload1_fp16_False_state_dict_rank0_and_offload_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload1_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload1_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload1_fp16_True_state_dict_rank0_and_offload_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload1_fp16_True_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload0_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload0_fp16_True_state_dict_rank0_and_offload_False_use_orig_params_False, 
test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload0_fp16_True_state_dict_rank0_and_offload_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload0_fp16_True_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload1_fp16_False_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload1_fp16_False_state_dict_rank0_and_offload_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload1_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload1_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload1_fp16_True_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload1_fp16_True_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload1_fp16_True_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload0_fp16_False_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload0_fp16_True_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload0_fp16_True_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload1_fp16_False_state_dict_rank0_and_offload_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_basic_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload1_fp16_True_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload0_mixed_precision_False_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload0_mixed_precision_False_state_dict_rank0_and_offload_True_use_orig_params_True, 
test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload0_mixed_precision_True_state_dict_rank0_and_offload_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload0_mixed_precision_True_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload0_mixed_precision_True_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload1_mixed_precision_False_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload1_mixed_precision_False_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload1_mixed_precision_True_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_local_state_dict_cpu_offload1_mixed_precision_True_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload0_mixed_precision_False_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload0_mixed_precision_False_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload0_mixed_precision_True_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload0_mixed_precision_True_state_dict_rank0_and_offload_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload0_mixed_precision_True_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload0_mixed_precision_True_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload1_mixed_precision_False_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload1_mixed_precision_False_state_dict_rank0_and_offload_False_use_orig_params_True, 
test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload1_mixed_precision_False_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload1_mixed_precision_False_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload1_mixed_precision_True_state_dict_rank0_and_offload_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload1_mixed_precision_True_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_sharded_state_dict_cpu_offload1_mixed_precision_True_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload0_mixed_precision_False_state_dict_rank0_and_offload_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload0_mixed_precision_False_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload0_mixed_precision_False_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload0_mixed_precision_True_state_dict_rank0_and_offload_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload0_mixed_precision_True_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload1_mixed_precision_False_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload1_mixed_precision_False_state_dict_rank0_and_offload_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload1_mixed_precision_False_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload1_mixed_precision_True_state_dict_rank0_and_offload_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload1_mixed_precision_True_state_dict_rank0_and_offload_False_use_orig_params_True, 
test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload1_mixed_precision_True_state_dict_rank0_and_offload_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_buffers_save_and_load_state_dict_state_dict_type_state_dict_cpu_offload1_mixed_precision_True_state_dict_rank0_and_offload_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_fsdp_state_dict_keys_state_dict_type_sharded_state_dict, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_sharded_state_dict_checkpoint_wrap_both_after_wrap_rank0_only_and_offload_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_sharded_state_dict_checkpoint_wrap_both_rank0_only_and_offload_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_sharded_state_dict_checkpoint_wrap_both_rank0_only_and_offload_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_sharded_state_dict_checkpoint_wrap_dest_rank0_only_and_offload_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_sharded_state_dict_checkpoint_wrap_source_after_wrap_rank0_only_and_offload_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_sharded_state_dict_checkpoint_wrap_source_rank0_only_and_offload_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_state_dict_checkpoint_wrap_source_after_wrap_rank0_only_and_offload_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_fsdp_state_dict_with_activation_checkpoint_state_dict_type_state_dict_checkpoint_wrap_source_rank0_only_and_offload_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_local_state_dict_with_empty_ranks, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_save_and_load_after_forward_state_dict_state_dict_type_local_state_dict_mixed_precision_True_state_dict_rank0_and_offload_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_save_and_load_after_forward_state_dict_state_dict_type_local_state_dict_mixed_precision_True_state_dict_rank0_and_offload_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_save_and_load_after_forward_state_dict_state_dict_type_sharded_state_dict_mixed_precision_False_state_dict_rank0_and_offload_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_save_and_load_after_forward_state_dict_state_dict_type_sharded_state_dict_mixed_precision_True_state_dict_rank0_and_offload_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_save_and_load_after_forward_state_dict_state_dict_type_state_dict_mixed_precision_False_state_dict_rank0_and_offload_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_save_and_load_after_forward_state_dict_state_dict_type_state_dict_mixed_precision_False_state_dict_rank0_and_offload_True, 
test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_sharded_load_multi_backend_pg, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_shared_module_and_shared_parameter, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_load_into_local_module_state_dict_type_sharded_state_dict_state_dict_rank0_and_offload_False_fsdp_root_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_load_into_local_module_state_dict_type_sharded_state_dict_state_dict_rank0_and_offload_True_fsdp_root_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_load_into_local_module_state_dict_type_state_dict_state_dict_rank0_and_offload_True_fsdp_root_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_rank0_offload_save_load_flow_use_orig_params_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_save_load_flow_state_dict_type_local_state_dict, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_save_load_flow_state_dict_type_state_dict, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_skip_module_state_dict_type_local_state_dict_double_nest_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_skip_module_state_dict_type_sharded_state_dict_double_nest_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_skip_module_state_dict_type_state_dict_double_nest_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_ignored_modules_state_dict_type_sharded_state_dict_prefix_False_ignore_inner_False_mixed_precision_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_ignored_modules_state_dict_type_sharded_state_dict_prefix_False_ignore_inner_True_mixed_precision_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_ignored_modules_state_dict_type_sharded_state_dict_prefix_True_ignore_inner_True_mixed_precision_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_ignored_modules_state_dict_type_state_dict_prefix_False_ignore_inner_False_mixed_precision_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_ignored_modules_state_dict_type_state_dict_prefix_False_ignore_inner_True_mixed_precision_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_ignored_modules_state_dict_type_state_dict_prefix_True_ignore_inner_False_mixed_precision_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_ignored_modules_state_dict_type_state_dict_prefix_True_ignore_inner_False_mixed_precision_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_ignored_modules_state_dict_type_state_dict_prefix_True_ignore_inner_True_mixed_precision_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_manual_ac_wrapper_state_dict_type_sharded_state_dict_rank0_only_and_offload_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_manual_ac_wrapper_state_dict_type_sharded_state_dict_rank0_only_and_offload_True, 
test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_manual_ac_wrapper_state_dict_type_state_dict_rank0_only_and_offload_False, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_manual_ac_wrapper_state_dict_type_state_dict_rank0_only_and_offload_True, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_shared_parameters_state_dict_type_local_state_dict, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_state_dict_with_shared_parameters_state_dict_type_sharded_state_dict, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_world_size_one, test/distributed/fsdp/test_fsdp_state_dict.py::TestFSDPStateDict::test_wrong_state_dict_config
2025-12-04T11:55:04.3087224Z 
2025-12-04T11:55:04.3087363Z Finished distributed/fsdp/test_fsdp_state_dict 2/2 ... [2025-12-04 11:55:04.303816][5227345.282855816], took 8.05min
2025-12-04T11:55:04.3087814Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml
2025-12-04T11:55:04.3088208Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T11:55:04.3088429Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading
2025-12-04T11:55:04.3088609Z Uploading artifacts took 0.00 seconds
2025-12-04T11:55:04.3088806Z Running distributed/fsdp/test_fsdp_exec_order 1/1 ... [2025-12-04 11:55:04.306031][5227345.285072508]
2025-12-04T11:55:04.3089008Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T11:55:04.3089415Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_exec_order.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:55:04.306215]
2025-12-04T11:59:23.3237598Z 
2025-12-04T11:59:23.3238478Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_exec_order 1/1 (test/test-reports/distributed.fsdp.test_fsdp_exec_order_1.1_b71b860b1a78e6ee_.log)
2025-12-04T11:59:23.3239128Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-6899852671441b60.xml
2025-12-04T11:59:23.3239645Z ============================= test session starts ==============================
2025-12-04T11:59:23.3239931Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T11:59:23.3240184Z cachedir: .pytest_cache
2025-12-04T11:59:23.3240486Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T11:59:23.3240825Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T11:59:23.3241027Z configfile: pytest.ini
2025-12-04T11:59:23.3241346Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T11:59:23.3241701Z collecting ... 
collected 8 items 2025-12-04T11:59:23.3241954Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T11:59:23.3243972Z Running 8 items in this shard: test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda, test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda, test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda, test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda, test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda, test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda, test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda, test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda 2025-12-04T11:59:23.3246698Z 2025-12-04T11:59:23.3247114Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda I1204 11:55:06.087000 205911 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 205980 2025-12-04T11:59:23.3247762Z I1204 11:55:06.088000 205911 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 205981 2025-12-04T11:59:23.3248201Z I1204 11:55:06.088000 205911 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 205982 2025-12-04T11:59:23.3248631Z I1204 11:55:06.089000 205911 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 205983 2025-12-04T11:59:23.3249544Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3250206Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3250808Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3251593Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3252191Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
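Note: the FSDP UserWarning above is advisory and is not what makes the test fail. It asks that each rank either call torch.cuda.set_device() before constructing FSDP, or pass an explicit device index as `device_id` instead of the bare "cuda" device. A minimal sketch of both options, assuming the default process group has already been initialized by the test harness; the function and variable names here are illustrative, not taken from the test:

    import torch
    import torch.distributed as dist
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_model_for_rank(local_rank: int) -> FSDP:
        # Assumes dist.init_process_group(...) was already called by the harness.
        assert dist.is_initialized()
        # Option 1: bind this process to its GPU first, so a bare "cuda" device
        # (and every later allocation) resolves to the right index.
        torch.cuda.set_device(local_rank)
        model = nn.Linear(8, 8)
        # Option 2: pass an explicit index as device_id rather than plain "cuda",
        # which is what triggered the warning in this log.
        return FSDP(model, device_id=torch.device("cuda", local_rank))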
2025-12-04T11:59:23.3252790Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3253382Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3253978Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3254224Z [rank1]:E1204 11:55:11.296000 205981 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3254586Z [rank1]:E1204 11:55:11.296000 205981 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3255112Z [rank1]:E1204 11:55:11.296000 205981 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3255617Z [rank1]:E1204 11:55:11.296000 205981 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3256128Z [rank1]:E1204 11:55:11.296000 205981 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3256593Z [rank1]:E1204 11:55:11.296000 205981 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3257090Z [rank1]:E1204 11:55:11.296000 205981 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3257668Z [rank1]:E1204 11:55:11.296000 205981 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3258197Z [rank1]:E1204 11:55:11.296000 205981 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3258690Z [rank1]:E1204 11:55:11.296000 205981 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3259174Z [rank1]:E1204 11:55:11.296000 205981 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3259689Z [rank1]:E1204 11:55:11.296000 205981 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3260149Z [rank1]:E1204 11:55:11.296000 205981 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3260628Z [rank1]:E1204 11:55:11.296000 205981 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3261461Z [rank1]:E1204 11:55:11.296000 205981 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 2025-12-04T11:59:23.3262101Z [rank1]:E1204 11:55:11.296000 205981 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3262461Z [rank1]:E1204 11:55:11.296000 205981 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3263086Z [rank1]:E1204 11:55:11.296000 205981 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:59:23.3263614Z [rank1]:E1204 11:55:11.296000 205981 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3263992Z [rank1]:E1204 11:55:11.296000 205981 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3264817Z [rank1]:E1204 11:55:11.296000 205981 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:59:23.3265066Z dist init r=1, world=4 2025-12-04T11:59:23.3265273Z [rank0]:E1204 11:55:11.317000 205980 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3265610Z [rank0]:E1204 11:55:11.317000 205980 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3266099Z [rank0]:E1204 11:55:11.317000 205980 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3266585Z [rank0]:E1204 11:55:11.317000 205980 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3267113Z [rank0]:E1204 11:55:11.317000 205980 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3267566Z [rank0]:E1204 11:55:11.317000 205980 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3268008Z [rank0]:E1204 11:55:11.317000 205980 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3268475Z [rank0]:E1204 11:55:11.317000 205980 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3268947Z [rank0]:E1204 11:55:11.317000 205980 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3269409Z [rank0]:E1204 11:55:11.317000 205980 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3269930Z [rank0]:E1204 11:55:11.317000 205980 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3270380Z [rank0]:E1204 11:55:11.317000 205980 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3270838Z [rank0]:E1204 11:55:11.317000 205980 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3271338Z [rank0]:E1204 11:55:11.317000 205980 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3272014Z [rank0]:E1204 11:55:11.317000 205980 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2459959296 and is now 3214934016. 2025-12-04T11:59:23.3272644Z [rank0]:E1204 11:55:11.317000 205980 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3272995Z [rank0]:E1204 11:55:11.317000 205980 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3273604Z [rank0]:E1204 11:55:11.317000 205980 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:59:23.3274125Z [rank0]:E1204 11:55:11.317000 205980 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3274494Z [rank0]:E1204 11:55:11.317000 205980 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3274911Z [rank0]:E1204 11:55:11.317000 205980 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:59:23.3275153Z dist init r=0, world=4 2025-12-04T11:59:23.3275355Z [rank2]:E1204 11:55:11.343000 205982 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3275698Z [rank2]:E1204 11:55:11.343000 205982 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3276186Z [rank2]:E1204 11:55:11.343000 205982 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3276699Z [rank2]:E1204 11:55:11.343000 205982 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3277183Z [rank2]:E1204 11:55:11.343000 205982 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3277632Z [rank2]:E1204 11:55:11.343000 205982 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3278074Z [rank2]:E1204 11:55:11.343000 205982 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3278538Z [rank2]:E1204 11:55:11.343000 205982 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3279005Z [rank2]:E1204 11:55:11.343000 205982 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3279467Z [rank2]:E1204 11:55:11.343000 205982 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3280005Z [rank2]:E1204 11:55:11.343000 205982 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3280520Z [rank2]:E1204 11:55:11.343000 205982 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3280979Z [rank2]:E1204 11:55:11.343000 205982 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3281456Z [rank2]:E1204 11:55:11.343000 205982 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3282139Z [rank2]:E1204 11:55:11.343000 205982 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 2. CUDA driver allocated memory was 2300575744 and is now 3055550464. 
2025-12-04T11:59:23.3282773Z [rank2]:E1204 11:55:11.343000 205982 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3283121Z [rank2]:E1204 11:55:11.343000 205982 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3283727Z [rank2]:E1204 11:55:11.343000 205982 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:59:23.3284247Z [rank2]:E1204 11:55:11.343000 205982 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3284612Z [rank2]:E1204 11:55:11.343000 205982 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3285030Z [rank2]:E1204 11:55:11.343000 205982 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:59:23.3285272Z dist init r=2, world=4 2025-12-04T11:59:23.3285476Z [rank3]:E1204 11:55:11.386000 205983 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3285816Z [rank3]:E1204 11:55:11.386000 205983 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3286351Z [rank3]:E1204 11:55:11.386000 205983 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3286832Z [rank3]:E1204 11:55:11.386000 205983 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3287312Z [rank3]:E1204 11:55:11.386000 205983 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3287762Z [rank3]:E1204 11:55:11.386000 205983 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3288204Z [rank3]:E1204 11:55:11.386000 205983 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3288672Z [rank3]:E1204 11:55:11.386000 205983 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3289138Z [rank3]:E1204 11:55:11.386000 205983 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3289676Z [rank3]:E1204 11:55:11.386000 205983 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3290142Z [rank3]:E1204 11:55:11.386000 205983 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3290599Z [rank3]:E1204 11:55:11.386000 205983 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:59:23.3291054Z [rank3]:E1204 11:55:11.386000 205983 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3291520Z [rank3]:E1204 11:55:11.386000 205983 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3292194Z [rank3]:E1204 11:55:11.386000 205983 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 3. CUDA driver allocated memory was 2250244096 and is now 3005218816. 2025-12-04T11:59:23.3292828Z [rank3]:E1204 11:55:11.386000 205983 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3293179Z [rank3]:E1204 11:55:11.386000 205983 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3293788Z [rank3]:E1204 11:55:11.386000 205983 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:59:23.3294309Z [rank3]:E1204 11:55:11.386000 205983 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3294678Z [rank3]:E1204 11:55:11.386000 205983 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3295093Z [rank3]:E1204 11:55:11.386000 205983 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:59:23.3295363Z dist init r=3, world=4 2025-12-04T11:59:23.3295471Z FAILED [6.1141s] [ 12%] 2025-12-04T11:59:23.3295536Z 2025-12-04T11:59:23.3295598Z =================================== FAILURES =================================== 2025-12-04T11:59:23.3295801Z _ TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda __ 2025-12-04T11:59:23.3295989Z Traceback (most recent call last): 2025-12-04T11:59:23.3296238Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:59:23.3296489Z self._join_processes(fn) 2025-12-04T11:59:23.3296738Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:59:23.3297006Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:59:23.3297276Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:59:23.3297541Z raise RuntimeError(error) 2025-12-04T11:59:23.3297694Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:59:23.3297860Z Traceback (most recent call last): 2025-12-04T11:59:23.3298102Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3298347Z getattr(self, test_name)() 2025-12-04T11:59:23.3298579Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3298845Z fn() 2025-12-04T11:59:23.3299051Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3299286Z method(*args, **kwargs) 2025-12-04T11:59:23.3299510Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3299781Z method(*args, **kwargs) 2025-12-04T11:59:23.3300004Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3300235Z with policy(): 2025-12-04T11:59:23.3300451Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3300686Z raise RuntimeError(msg) 2025-12-04T11:59:23.3301117Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 2025-12-04T11:59:23.3301516Z 2025-12-04T11:59:23.3301595Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3301958Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:59:23.3302240Z 2025-12-04T11:59:23.3302332Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3302459Z 2025-12-04T11:59:23.3302461Z 2025-12-04T11:59:23.3302543Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:59:23.3302745Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:59:23.3303123Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-6899852671441b60.xml -
2025-12-04T11:59:23.3303469Z =========================== short test summary info ============================
2025-12-04T11:59:23.3303874Z FAILED [6.1141s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda - RuntimeError: Process 1 exited with error code 10 and exception:
2025-12-04T11:59:23.3304216Z Traceback (most recent call last):
2025-12-04T11:59:23.3304460Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T11:59:23.3304708Z getattr(self, test_name)()
2025-12-04T11:59:23.3304943Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T11:59:23.3305178Z fn()
2025-12-04T11:59:23.3305382Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T11:59:23.3305614Z method(*args, **kwargs)
2025-12-04T11:59:23.3305834Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T11:59:23.3306064Z method(*args, **kwargs)
2025-12-04T11:59:23.3306287Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T11:59:23.3306518Z with policy():
2025-12-04T11:59:23.3306731Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T11:59:23.3306965Z raise RuntimeError(msg)
2025-12-04T11:59:23.3307390Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680.
2025-12-04T11:59:23.3307815Z 
2025-12-04T11:59:23.3307889Z To execute this test, run the following from the base repo dir:
2025-12-04T11:59:23.3308247Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda
2025-12-04T11:59:23.3308528Z 
2025-12-04T11:59:23.3308618Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T11:59:23.3308809Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T11:59:23.3308969Z ============================== 1 failed in 6.12s ===============================
2025-12-04T11:59:23.3309102Z Got exit code 1
2025-12-04T11:59:23.3309200Z Retrying single test...
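After this first failure the harness retries the failing test in isolation, which is the second pytest session below. To reproduce the failure outside CI, the log already prints the exact command; a small wrapper with the same environment variables is sketched here, assuming a ROCm (or CUDA) build of torch is installed and a pytorch checkout is the working directory:

    import os
    import subprocess

    # Same command the log prints under "To execute this test, run the following
    # from the base repo dir"; the environment variables are taken verbatim from it.
    env = dict(
        os.environ,
        PYTORCH_TEST_WITH_ROCM="1",
        PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1",
        # PYTORCH_PRINT_REPRO_ON_FAILURE="0",  # uncomment to silence the repro banner
    )
    subprocess.run(
        [
            "python",
            "test/distributed/fsdp/test_fsdp_exec_order.py",
            "TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda",
        ],
        env=env,
        check=True,
    )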
2025-12-04T11:59:23.3309475Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-ffa467cb0a1c4f9d.xml 2025-12-04T11:59:23.3309814Z ============================= test session starts ============================== 2025-12-04T11:59:23.3310030Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:59:23.3310221Z cachedir: .pytest_cache 2025-12-04T11:59:23.3310446Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:59:23.3310688Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:59:23.3310809Z configfile: pytest.ini 2025-12-04T11:59:23.3311039Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:59:23.3311308Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:59:23.3311652Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:59:23.3311970Z Running 1 items in this shard 2025-12-04T11:59:23.3312046Z 2025-12-04T11:59:23.3312370Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda I1204 11:55:14.889000 206289 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 206358 2025-12-04T11:59:23.3312928Z I1204 11:55:14.890000 206289 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 206359 2025-12-04T11:59:23.3313275Z I1204 11:55:14.890000 206289 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 206360 2025-12-04T11:59:23.3313618Z I1204 11:55:14.891000 206289 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 206361 2025-12-04T11:59:23.3314307Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3314899Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3315481Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3316064Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3316646Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:59:23.3317261Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3317842Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3318427Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3318667Z [rank2]:E1204 11:55:20.153000 206360 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3319013Z [rank2]:E1204 11:55:20.153000 206360 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3319505Z [rank2]:E1204 11:55:20.153000 206360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3320033Z [rank2]:E1204 11:55:20.153000 206360 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3320514Z [rank2]:E1204 11:55:20.153000 206360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3320963Z [rank2]:E1204 11:55:20.153000 206360 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3321413Z [rank2]:E1204 11:55:20.153000 206360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3321877Z [rank2]:E1204 11:55:20.153000 206360 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3322367Z [rank2]:E1204 11:55:20.153000 206360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3322835Z [rank2]:E1204 11:55:20.153000 206360 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3323299Z [rank2]:E1204 11:55:20.153000 206360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3323753Z [rank2]:E1204 11:55:20.153000 206360 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3324208Z [rank2]:E1204 11:55:20.153000 206360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3324674Z [rank2]:E1204 11:55:20.153000 206360 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3325355Z [rank2]:E1204 11:55:20.153000 206360 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 2. CUDA driver allocated memory was 2300575744 and is now 3055550464. 2025-12-04T11:59:23.3326021Z [rank2]:E1204 11:55:20.153000 206360 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3326372Z [rank2]:E1204 11:55:20.153000 206360 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3326981Z [rank2]:E1204 11:55:20.153000 206360 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:59:23.3327513Z [rank2]:E1204 11:55:20.153000 206360 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3327884Z [rank2]:E1204 11:55:20.153000 206360 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3328304Z [rank2]:E1204 11:55:20.153000 206360 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:59:23.3328547Z dist init r=2, world=4 2025-12-04T11:59:23.3328750Z [rank1]:E1204 11:55:20.156000 206359 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3329088Z [rank1]:E1204 11:55:20.156000 206359 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3329613Z [rank1]:E1204 11:55:20.156000 206359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3330092Z [rank1]:E1204 11:55:20.156000 206359 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3330571Z [rank1]:E1204 11:55:20.156000 206359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3331023Z [rank1]:E1204 11:55:20.156000 206359 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3331493Z [rank1]:E1204 11:55:20.156000 206359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3331964Z [rank1]:E1204 11:55:20.156000 206359 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3332429Z [rank1]:E1204 11:55:20.156000 206359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3332894Z [rank1]:E1204 11:55:20.156000 206359 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3333358Z [rank1]:E1204 11:55:20.156000 206359 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3333814Z [rank1]:E1204 11:55:20.156000 206359 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3334276Z [rank1]:E1204 11:55:20.156000 206359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3334743Z [rank1]:E1204 11:55:20.156000 206359 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3335415Z [rank1]:E1204 11:55:20.156000 206359 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 2025-12-04T11:59:23.3336076Z [rank1]:E1204 11:55:20.156000 206359 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3336427Z [rank1]:E1204 11:55:20.156000 206359 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3337031Z [rank1]:E1204 11:55:20.156000 206359 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:59:23.3337550Z [rank1]:E1204 11:55:20.156000 206359 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3337916Z [rank1]:E1204 11:55:20.156000 206359 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3338333Z [rank1]:E1204 11:55:20.156000 206359 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:59:23.3338576Z dist init r=1, world=4 2025-12-04T11:59:23.3338778Z [rank3]:E1204 11:55:20.163000 206361 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3339114Z [rank3]:E1204 11:55:20.163000 206361 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3339648Z [rank3]:E1204 11:55:20.163000 206361 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3340129Z [rank3]:E1204 11:55:20.163000 206361 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3340609Z [rank3]:E1204 11:55:20.163000 206361 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3341089Z [rank3]:E1204 11:55:20.163000 206361 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3341532Z [rank3]:E1204 11:55:20.163000 206361 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3341996Z [rank3]:E1204 11:55:20.163000 206361 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3342466Z [rank3]:E1204 11:55:20.163000 206361 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3342929Z [rank3]:E1204 11:55:20.163000 206361 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3343392Z [rank3]:E1204 11:55:20.163000 206361 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3343843Z [rank3]:E1204 11:55:20.163000 206361 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3344296Z [rank3]:E1204 11:55:20.163000 206361 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3344795Z [rank3]:E1204 11:55:20.163000 206361 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3345474Z [rank3]:E1204 11:55:20.163000 206361 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 3. CUDA driver allocated memory was 2250244096 and is now 3005218816. 
2025-12-04T11:59:23.3346102Z [rank3]:E1204 11:55:20.163000 206361 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3346454Z [rank3]:E1204 11:55:20.163000 206361 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3347062Z [rank3]:E1204 11:55:20.163000 206361 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:59:23.3347587Z [rank3]:E1204 11:55:20.163000 206361 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3347954Z [rank3]:E1204 11:55:20.163000 206361 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3348370Z [rank3]:E1204 11:55:20.163000 206361 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:59:23.3348613Z dist init r=3, world=4 2025-12-04T11:59:23.3348819Z [rank0]:E1204 11:55:20.171000 206358 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3349161Z [rank0]:E1204 11:55:20.171000 206358 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3349692Z [rank0]:E1204 11:55:20.171000 206358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3350215Z [rank0]:E1204 11:55:20.171000 206358 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3350700Z [rank0]:E1204 11:55:20.171000 206358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3351147Z [rank0]:E1204 11:55:20.171000 206358 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3351590Z [rank0]:E1204 11:55:20.171000 206358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3352061Z [rank0]:E1204 11:55:20.171000 206358 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3352525Z [rank0]:E1204 11:55:20.171000 206358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3352987Z [rank0]:E1204 11:55:20.171000 206358 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3353451Z [rank0]:E1204 11:55:20.171000 206358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3353935Z [rank0]:E1204 11:55:20.171000 206358 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:59:23.3354390Z [rank0]:E1204 11:55:20.171000 206358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3354857Z [rank0]:E1204 11:55:20.171000 206358 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3355531Z [rank0]:E1204 11:55:20.171000 206358 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2459959296 and is now 3214934016. 2025-12-04T11:59:23.3356170Z [rank0]:E1204 11:55:20.171000 206358 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3356519Z [rank0]:E1204 11:55:20.171000 206358 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3357125Z [rank0]:E1204 11:55:20.171000 206358 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:59:23.3357649Z [rank0]:E1204 11:55:20.171000 206358 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3358014Z [rank0]:E1204 11:55:20.171000 206358 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3358427Z [rank0]:E1204 11:55:20.171000 206358 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:59:23.3358670Z dist init r=0, world=4 2025-12-04T11:59:23.3358773Z FAILED [6.2150s] [100%] 2025-12-04T11:59:23.3358837Z 2025-12-04T11:59:23.3358897Z =================================== FAILURES =================================== 2025-12-04T11:59:23.3359098Z _ TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda __ 2025-12-04T11:59:23.3359307Z Traceback (most recent call last): 2025-12-04T11:59:23.3359555Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:59:23.3359850Z self._join_processes(fn) 2025-12-04T11:59:23.3360101Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:59:23.3360366Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:59:23.3360637Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:59:23.3360900Z raise RuntimeError(error) 2025-12-04T11:59:23.3361053Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:59:23.3361218Z Traceback (most recent call last): 2025-12-04T11:59:23.3361462Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3361708Z getattr(self, test_name)() 2025-12-04T11:59:23.3361944Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3362179Z fn() 2025-12-04T11:59:23.3362384Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3362617Z method(*args, **kwargs) 2025-12-04T11:59:23.3362881Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3363112Z method(*args, **kwargs) 2025-12-04T11:59:23.3363333Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3363563Z with policy(): 2025-12-04T11:59:23.3363781Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3364015Z raise RuntimeError(msg) 2025-12-04T11:59:23.3364445Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 2025-12-04T11:59:23.3364838Z 2025-12-04T11:59:23.3364917Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3365287Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:59:23.3365567Z 2025-12-04T11:59:23.3365658Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3365782Z 2025-12-04T11:59:23.3365784Z 2025-12-04T11:59:23.3365866Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:59:23.3366067Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:59:23.3366448Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-ffa467cb0a1c4f9d.xml - 2025-12-04T11:59:23.3366797Z =========================== short test summary info ============================ 2025-12-04T11:59:23.3367159Z FAILED [6.2150s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:59:23.3367503Z Traceback (most recent call last): 2025-12-04T11:59:23.3367749Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3367996Z getattr(self, test_name)() 2025-12-04T11:59:23.3368266Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3368505Z fn() 2025-12-04T11:59:23.3368709Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3368943Z method(*args, **kwargs) 2025-12-04T11:59:23.3369165Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3369400Z method(*args, **kwargs) 2025-12-04T11:59:23.3369659Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3369890Z with policy(): 2025-12-04T11:59:23.3370105Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3370340Z raise RuntimeError(msg) 2025-12-04T11:59:23.3370777Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 2025-12-04T11:59:23.3371171Z 2025-12-04T11:59:23.3371246Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3371601Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:59:23.3371915Z 2025-12-04T11:59:23.3372003Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3372195Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:59:23.3372364Z ======================= 1 failed, 7 deselected in 6.22s ======================== 2025-12-04T11:59:23.3372506Z Got exit code 1 2025-12-04T11:59:23.3372610Z Retrying single test... 
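"Retrying single test..." means the runner re-executes only the failing test in a fresh process before declaring it consistently failed. Locally, the printed repro command can be reproduced with a small wrapper like the one below; the function name is illustrative, and the environment variables are exactly the ones shown in the failure message.

    import os
    import subprocess
    import sys

    def rerun_repro(test_file: str, test_name: str) -> int:
        # Environment variables are the ones printed in the repro command above.
        env = dict(os.environ)
        env["PYTORCH_TEST_WITH_ROCM"] = "1"
        env["PYTORCH_TEST_CUDA_MEM_LEAK_CHECK"] = "1"
        return subprocess.run([sys.executable, test_file, test_name], env=env).returncode

    # Example, mirroring the printed command:
    # rerun_repro("test/distributed/fsdp/test_fsdp_exec_order.py",
    #             "TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda")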
2025-12-04T11:59:23.3372881Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-7749d1f1d0257bb2.xml 2025-12-04T11:59:23.3373182Z ============================= test session starts ============================== 2025-12-04T11:59:23.3373399Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:59:23.3373588Z cachedir: .pytest_cache 2025-12-04T11:59:23.3373819Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:59:23.3374063Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:59:23.3374184Z configfile: pytest.ini 2025-12-04T11:59:23.3374417Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:59:23.3374692Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:59:23.3375042Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:59:23.3375361Z Running 1 items in this shard 2025-12-04T11:59:23.3375437Z 2025-12-04T11:59:23.3375769Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda I1204 11:55:23.461000 206667 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 206736 2025-12-04T11:59:23.3376305Z I1204 11:55:23.462000 206667 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 206737 2025-12-04T11:59:23.3376662Z I1204 11:55:23.462000 206667 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 206738 2025-12-04T11:59:23.3377043Z I1204 11:55:23.463000 206667 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 206739 2025-12-04T11:59:23.3377751Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3378351Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3378948Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3379544Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3380190Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
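The UserWarning above is raised because the test passes `device_id` as a bare `cuda` device with no index, so FSDP falls back to the current device on each rank. Outside of this test, the warning goes away if the rank's device is made explicit, for example as sketched below (assuming a process group has already been initialized for `rank`; the helper name is illustrative).

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_with_explicit_device(module: torch.nn.Module, rank: int) -> FSDP:
        # Assumes torch.distributed.init_process_group() has already run for this rank.
        torch.cuda.set_device(rank)                                 # option 1: pin the current device
        return FSDP(module, device_id=torch.device("cuda", rank))   # option 2: pass an indexed device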
2025-12-04T11:59:23.3380788Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3381414Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3382013Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3382258Z [rank1]:E1204 11:55:28.604000 206737 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3382607Z [rank1]:E1204 11:55:28.604000 206737 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3383108Z [rank1]:E1204 11:55:28.604000 206737 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3383605Z [rank1]:E1204 11:55:28.604000 206737 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3384095Z [rank1]:E1204 11:55:28.604000 206737 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3384551Z [rank1]:E1204 11:55:28.604000 206737 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3385000Z [rank1]:E1204 11:55:28.604000 206737 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3385473Z [rank1]:E1204 11:55:28.604000 206737 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3385948Z [rank1]:E1204 11:55:28.604000 206737 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3386423Z [rank1]:E1204 11:55:28.604000 206737 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3386927Z [rank1]:E1204 11:55:28.604000 206737 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3387387Z [rank1]:E1204 11:55:28.604000 206737 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3387852Z [rank1]:E1204 11:55:28.604000 206737 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3388330Z [rank1]:E1204 11:55:28.604000 206737 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3389019Z [rank1]:E1204 11:55:28.604000 206737 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 2025-12-04T11:59:23.3389704Z [rank1]:E1204 11:55:28.604000 206737 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3390065Z [rank1]:E1204 11:55:28.604000 206737 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3391872Z [rank1]:E1204 11:55:28.604000 206737 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:59:23.3392407Z [rank1]:E1204 11:55:28.604000 206737 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3392783Z [rank1]:E1204 11:55:28.604000 206737 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3393209Z [rank1]:E1204 11:55:28.604000 206737 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:59:23.3393456Z dist init r=1, world=4 2025-12-04T11:59:23.3393661Z [rank0]:E1204 11:55:28.605000 206736 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3394009Z [rank0]:E1204 11:55:28.605000 206736 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3394507Z [rank0]:E1204 11:55:28.605000 206736 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3394997Z [rank0]:E1204 11:55:28.605000 206736 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3395484Z [rank0]:E1204 11:55:28.605000 206736 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3395939Z [rank0]:E1204 11:55:28.605000 206736 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3396396Z [rank0]:E1204 11:55:28.605000 206736 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3396867Z [rank0]:E1204 11:55:28.605000 206736 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3397371Z [rank0]:E1204 11:55:28.605000 206736 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3397841Z [rank0]:E1204 11:55:28.605000 206736 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3398308Z [rank0]:E1204 11:55:28.605000 206736 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3398769Z [rank0]:E1204 11:55:28.605000 206736 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3399230Z [rank0]:E1204 11:55:28.605000 206736 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3399750Z [rank0]:E1204 11:55:28.605000 206736 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3400435Z [rank0]:E1204 11:55:28.605000 206736 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2459959296 and is now 3214934016. 2025-12-04T11:59:23.3401109Z [rank0]:E1204 11:55:28.605000 206736 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3416117Z [rank0]:E1204 11:55:28.605000 206736 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3416778Z [rank0]:E1204 11:55:28.605000 206736 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:59:23.3417315Z [rank0]:E1204 11:55:28.605000 206736 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3417685Z [rank0]:E1204 11:55:28.605000 206736 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3418101Z [rank0]:E1204 11:55:28.605000 206736 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:59:23.3418348Z dist init r=0, world=4 2025-12-04T11:59:23.3418558Z [rank2]:E1204 11:55:28.652000 206738 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3418904Z [rank2]:E1204 11:55:28.652000 206738 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3419401Z [rank2]:E1204 11:55:28.652000 206738 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3419942Z [rank2]:E1204 11:55:28.652000 206738 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3420418Z [rank2]:E1204 11:55:28.652000 206738 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3420869Z [rank2]:E1204 11:55:28.652000 206738 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3421373Z [rank2]:E1204 11:55:28.652000 206738 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3421836Z [rank2]:E1204 11:55:28.652000 206738 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3422299Z [rank2]:E1204 11:55:28.652000 206738 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3422757Z [rank2]:E1204 11:55:28.652000 206738 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3423218Z [rank2]:E1204 11:55:28.652000 206738 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3423684Z [rank2]:E1204 11:55:28.652000 206738 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3424153Z [rank2]:E1204 11:55:28.652000 206738 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3424631Z [rank2]:E1204 11:55:28.652000 206738 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3425313Z [rank2]:E1204 11:55:28.652000 206738 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 2. CUDA driver allocated memory was 2300575744 and is now 3055550464. 
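For scale, the driver-allocated byte counts in these messages translate to roughly 0.7 GiB of growth per device, while the caching-allocator growth is only 2 KiB (512 -> 2560 bytes). Plain arithmetic on the values logged for device 2:

    # Values taken from the device-2 message above; plain arithmetic, nothing torch-specific.
    driver_before, driver_after = 2300575744, 3055550464
    growth = driver_after - driver_before
    print(growth, growth / 2**20, growth / 2**30)  # 754974720 bytes = 720.0 MiB ≈ 0.70 GiB
    print(2560 - 512)                              # caching-allocator growth: 2048 bytes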
2025-12-04T11:59:23.3425988Z [rank2]:E1204 11:55:28.652000 206738 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3426347Z [rank2]:E1204 11:55:28.652000 206738 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3426960Z [rank2]:E1204 11:55:28.652000 206738 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:59:23.3427484Z [rank2]:E1204 11:55:28.652000 206738 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3427856Z [rank2]:E1204 11:55:28.652000 206738 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3428277Z [rank2]:E1204 11:55:28.652000 206738 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:59:23.3428526Z dist init r=2, world=4 2025-12-04T11:59:23.3428737Z [rank3]:E1204 11:55:28.655000 206739 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3429080Z [rank3]:E1204 11:55:28.655000 206739 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3429623Z [rank3]:E1204 11:55:28.655000 206739 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3430120Z [rank3]:E1204 11:55:28.655000 206739 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3430609Z [rank3]:E1204 11:55:28.655000 206739 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3431109Z [rank3]:E1204 11:55:28.655000 206739 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3431557Z [rank3]:E1204 11:55:28.655000 206739 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3432028Z [rank3]:E1204 11:55:28.655000 206739 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3432499Z [rank3]:E1204 11:55:28.655000 206739 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3432968Z [rank3]:E1204 11:55:28.655000 206739 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3433441Z [rank3]:E1204 11:55:28.655000 206739 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3433902Z [rank3]:E1204 11:55:28.655000 206739 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:59:23.3434367Z [rank3]:E1204 11:55:28.655000 206739 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3434872Z [rank3]:E1204 11:55:28.655000 206739 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3435557Z [rank3]:E1204 11:55:28.655000 206739 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 3. CUDA driver allocated memory was 2250244096 and is now 3005218816. 2025-12-04T11:59:23.3436200Z [rank3]:E1204 11:55:28.655000 206739 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3436554Z [rank3]:E1204 11:55:28.655000 206739 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3437162Z [rank3]:E1204 11:55:28.655000 206739 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:59:23.3437691Z [rank3]:E1204 11:55:28.655000 206739 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3438069Z [rank3]:E1204 11:55:28.655000 206739 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3438491Z [rank3]:E1204 11:55:28.655000 206739 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:59:23.3438742Z dist init r=3, world=4 2025-12-04T11:59:23.3438851Z FAILED [6.1151s] [100%] 2025-12-04T11:59:23.3438918Z 2025-12-04T11:59:23.3438985Z =================================== FAILURES =================================== 2025-12-04T11:59:23.3439194Z _ TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda __ 2025-12-04T11:59:23.3439388Z Traceback (most recent call last): 2025-12-04T11:59:23.3439683Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:59:23.3439942Z self._join_processes(fn) 2025-12-04T11:59:23.3440228Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:59:23.3440509Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:59:23.3440787Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:59:23.3441057Z raise RuntimeError(error) 2025-12-04T11:59:23.3441220Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:59:23.3441392Z Traceback (most recent call last): 2025-12-04T11:59:23.3441644Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3441898Z getattr(self, test_name)() 2025-12-04T11:59:23.3442138Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3442379Z fn() 2025-12-04T11:59:23.3442593Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3442833Z method(*args, **kwargs) 2025-12-04T11:59:23.3443062Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3443300Z method(*args, **kwargs) 2025-12-04T11:59:23.3443527Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3443795Z with policy(): 2025-12-04T11:59:23.3444021Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3444261Z raise RuntimeError(msg) 2025-12-04T11:59:23.3444698Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2459959296 and is now 3214934016. 2025-12-04T11:59:23.3445104Z 2025-12-04T11:59:23.3445184Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3445550Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:59:23.3445839Z 2025-12-04T11:59:23.3445931Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3446071Z 2025-12-04T11:59:23.3446134Z Process 1 exited with error code 10 and exception: 2025-12-04T11:59:23.3446285Z Traceback (most recent call last): 2025-12-04T11:59:23.3446539Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3446791Z getattr(self, test_name)() 2025-12-04T11:59:23.3447040Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3447280Z fn() 2025-12-04T11:59:23.3447489Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3447728Z method(*args, **kwargs) 2025-12-04T11:59:23.3447955Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3448193Z method(*args, **kwargs) 2025-12-04T11:59:23.3448423Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3448655Z with policy(): 2025-12-04T11:59:23.3448874Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3449114Z raise RuntimeError(msg) 2025-12-04T11:59:23.3449626Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 
2025-12-04T11:59:23.3450028Z 2025-12-04T11:59:23.3450107Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3450468Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:59:23.3450756Z 2025-12-04T11:59:23.3450846Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3450978Z 2025-12-04T11:59:23.3450980Z 2025-12-04T11:59:23.3451061Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:59:23.3451274Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:59:23.3451670Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-7749d1f1d0257bb2.xml - 2025-12-04T11:59:23.3452026Z =========================== short test summary info ============================ 2025-12-04T11:59:23.3452401Z FAILED [6.1151s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:59:23.3452782Z Traceback (most recent call last): 2025-12-04T11:59:23.3453035Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3453288Z getattr(self, test_name)() 2025-12-04T11:59:23.3453528Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3453769Z fn() 2025-12-04T11:59:23.3453979Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3454217Z method(*args, **kwargs) 2025-12-04T11:59:23.3454443Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3454680Z method(*args, **kwargs) 2025-12-04T11:59:23.3454903Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3455137Z with policy(): 2025-12-04T11:59:23.3455350Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3455583Z raise RuntimeError(msg) 2025-12-04T11:59:23.3456015Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2459959296 and is now 3214934016. 
2025-12-04T11:59:23.3456409Z 2025-12-04T11:59:23.3456484Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3456841Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:59:23.3457123Z 2025-12-04T11:59:23.3457211Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3457343Z 2025-12-04T11:59:23.3457402Z Process 1 exited with error code 10 and exception: 2025-12-04T11:59:23.3457544Z Traceback (most recent call last): 2025-12-04T11:59:23.3457786Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3458033Z getattr(self, test_name)() 2025-12-04T11:59:23.3458295Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3458532Z fn() 2025-12-04T11:59:23.3458731Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3458958Z method(*args, **kwargs) 2025-12-04T11:59:23.3459180Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3459415Z method(*args, **kwargs) 2025-12-04T11:59:23.3459695Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3459923Z with policy(): 2025-12-04T11:59:23.3460136Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3460369Z raise RuntimeError(msg) 2025-12-04T11:59:23.3460797Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 2025-12-04T11:59:23.3461188Z 2025-12-04T11:59:23.3461261Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3461613Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:59:23.3461925Z 2025-12-04T11:59:23.3462016Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3462209Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
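Each attempt also writes a JUnit-style XML report (the "generated xml file:" lines above). Those files can be inspected to pull out just the failure entries; the path below is the one named in this log, relative to the pytorch test directory, and would need adjusting to a local checkout.

    import xml.etree.ElementTree as ET

    # Report file as named in this log; adjust the path for a local run.
    report = ("test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/"
              "distributed.fsdp.test_fsdp_exec_order-7749d1f1d0257bb2.xml")

    for case in ET.parse(report).getroot().iter("testcase"):
        for failure in case.iter("failure"):
            print(f"{case.get('classname')}::{case.get('name')}")
            print((failure.get("message") or "")[:120])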
2025-12-04T11:59:23.3462379Z ======================= 1 failed, 7 deselected in 6.13s ======================== 2025-12-04T11:59:23.3462517Z Got exit code 1 2025-12-04T11:59:23.3462778Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:59:23.3463135Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T11:59:23.3463508Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-a7122a8fd02eb43a.xml 2025-12-04T11:59:23.3463815Z ============================= test session starts ============================== 2025-12-04T11:59:23.3464037Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:59:23.3464231Z cachedir: .pytest_cache 2025-12-04T11:59:23.3464459Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:59:23.3464704Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:59:23.3464825Z configfile: pytest.ini 2025-12-04T11:59:23.3465058Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:59:23.3465332Z collecting ... collected 8 items / 1 deselected / 7 selected 2025-12-04T11:59:23.3465493Z stepcurrent: skipping 1 already run items. 2025-12-04T11:59:23.3465626Z Running 7 items in this shard 2025-12-04T11:59:23.3465698Z 2025-12-04T11:59:23.3466028Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda I1204 11:55:32.050000 207045 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 207114 2025-12-04T11:59:23.3466548Z I1204 11:55:32.050000 207045 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 207115 2025-12-04T11:59:23.3466894Z I1204 11:55:32.051000 207045 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 207116 2025-12-04T11:59:23.3467287Z I1204 11:55:32.051000 207045 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 207117 2025-12-04T11:59:23.3467978Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3468566Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3469155Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:59:23.3469796Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3470383Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3471005Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3471591Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3472182Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3472422Z [rank3]:E1204 11:55:37.234000 207117 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3472768Z [rank3]:E1204 11:55:37.234000 207117 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3473263Z [rank3]:E1204 11:55:37.234000 207117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3473756Z [rank3]:E1204 11:55:37.234000 207117 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3474247Z [rank3]:E1204 11:55:37.234000 207117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3474697Z [rank3]:E1204 11:55:37.234000 207117 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3475141Z [rank3]:E1204 11:55:37.234000 207117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3475611Z [rank3]:E1204 11:55:37.234000 207117 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3476078Z [rank3]:E1204 11:55:37.234000 207117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3476570Z [rank3]:E1204 11:55:37.234000 207117 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3477035Z [rank3]:E1204 11:55:37.234000 207117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3477489Z [rank3]:E1204 11:55:37.234000 207117 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3477948Z [rank3]:E1204 11:55:37.234000 207117 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3478415Z [rank3]:E1204 11:55:37.234000 207117 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3479087Z [rank3]:E1204 11:55:37.234000 207117 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 3. CUDA driver allocated memory was 2250244096 and is now 3005218816. 2025-12-04T11:59:23.3479759Z [rank3]:E1204 11:55:37.234000 207117 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3480108Z [rank3]:E1204 11:55:37.234000 207117 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3480745Z [rank3]:E1204 11:55:37.234000 207117 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:59:23.3481265Z [rank3]:E1204 11:55:37.234000 207117 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3481629Z [rank3]:E1204 11:55:37.234000 207117 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3482042Z [rank3]:E1204 11:55:37.234000 207117 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:59:23.3482284Z dist init r=3, world=4 2025-12-04T11:59:23.3482490Z [rank0]:E1204 11:55:37.289000 207114 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3482827Z [rank0]:E1204 11:55:37.289000 207114 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3483317Z [rank0]:E1204 11:55:37.289000 207114 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3483795Z [rank0]:E1204 11:55:37.289000 207114 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3484273Z [rank0]:E1204 11:55:37.289000 207114 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3484720Z [rank0]:E1204 11:55:37.289000 207114 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3485162Z [rank0]:E1204 11:55:37.289000 207114 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3485625Z [rank0]:E1204 11:55:37.289000 207114 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3486117Z [rank0]:E1204 
11:55:37.289000 207114 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3486578Z [rank0]:E1204 11:55:37.289000 207114 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3487040Z [rank0]:E1204 11:55:37.289000 207114 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3487491Z [rank0]:E1204 11:55:37.289000 207114 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3487949Z [rank0]:E1204 11:55:37.289000 207114 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3488419Z [rank0]:E1204 11:55:37.289000 207114 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3489088Z [rank0]:E1204 11:55:37.289000 207114 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2459959296 and is now 3214934016. 2025-12-04T11:59:23.3489785Z [rank0]:E1204 11:55:37.289000 207114 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3490133Z [rank0]:E1204 11:55:37.289000 207114 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3490735Z [rank0]:E1204 11:55:37.289000 207114 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:59:23.3491254Z [rank0]:E1204 11:55:37.289000 207114 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3491615Z [rank0]:E1204 11:55:37.289000 207114 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3492030Z [rank0]:E1204 11:55:37.289000 207114 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:59:23.3492271Z dist init r=0, world=4 2025-12-04T11:59:23.3492474Z [rank1]:E1204 11:55:37.292000 207115 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3492814Z [rank1]:E1204 11:55:37.292000 207115 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3493305Z [rank1]:E1204 11:55:37.292000 207115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3493786Z [rank1]:E1204 11:55:37.292000 207115 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 
2025-12-04T11:59:23.3494265Z [rank1]:E1204 11:55:37.292000 207115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3494714Z [rank1]:E1204 11:55:37.292000 207115 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3495182Z [rank1]:E1204 11:55:37.292000 207115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3495644Z [rank1]:E1204 11:55:37.292000 207115 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3496104Z [rank1]:E1204 11:55:37.292000 207115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3496567Z [rank1]:E1204 11:55:37.292000 207115 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3497028Z [rank1]:E1204 11:55:37.292000 207115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3497480Z [rank1]:E1204 11:55:37.292000 207115 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3497935Z [rank1]:E1204 11:55:37.292000 207115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3498400Z [rank1]:E1204 11:55:37.292000 207115 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3499102Z [rank1]:E1204 11:55:37.292000 207115 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 
2025-12-04T11:59:23.3499790Z [rank1]:E1204 11:55:37.292000 207115 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3500138Z [rank1]:E1204 11:55:37.292000 207115 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3500741Z [rank1]:E1204 11:55:37.292000 207115 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:59:23.3501259Z [rank1]:E1204 11:55:37.292000 207115 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3501621Z [rank1]:E1204 11:55:37.292000 207115 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3502034Z [rank1]:E1204 11:55:37.292000 207115 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:59:23.3502274Z dist init r=1, world=4 2025-12-04T11:59:23.3502477Z [rank2]:E1204 11:55:37.299000 207116 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3502814Z [rank2]:E1204 11:55:37.299000 207116 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3503298Z [rank2]:E1204 11:55:37.299000 207116 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3503779Z [rank2]:E1204 11:55:37.299000 207116 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3504286Z [rank2]:E1204 11:55:37.299000 207116 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3504737Z [rank2]:E1204 11:55:37.299000 207116 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3505174Z [rank2]:E1204 11:55:37.299000 207116 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3505639Z [rank2]:E1204 11:55:37.299000 207116 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3506101Z [rank2]:E1204 11:55:37.299000 207116 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3506564Z [rank2]:E1204 11:55:37.299000 207116 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3507029Z [rank2]:E1204 11:55:37.299000 207116 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3507479Z [rank2]:E1204 11:55:37.299000 207116 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:59:23.3507930Z [rank2]:E1204 11:55:37.299000 207116 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3508427Z [rank2]:E1204 11:55:37.299000 207116 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3509096Z [rank2]:E1204 11:55:37.299000 207116 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 2. CUDA driver allocated memory was 2300575744 and is now 3055550464. 2025-12-04T11:59:23.3509761Z [rank2]:E1204 11:55:37.299000 207116 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3510106Z [rank2]:E1204 11:55:37.299000 207116 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3510709Z [rank2]:E1204 11:55:37.299000 207116 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:59:23.3511230Z [rank2]:E1204 11:55:37.299000 207116 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3511592Z [rank2]:E1204 11:55:37.299000 207116 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3512007Z [rank2]:E1204 11:55:37.299000 207116 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:59:23.3512247Z dist init r=2, world=4 2025-12-04T11:59:23.3512348Z FAILED [6.1154s] [ 14%] 2025-12-04T11:59:23.3512414Z 2025-12-04T11:59:23.3512476Z =================================== FAILURES =================================== 2025-12-04T11:59:23.3512674Z _ TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda __ 2025-12-04T11:59:23.3512861Z Traceback (most recent call last): 2025-12-04T11:59:23.3513106Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:59:23.3513353Z self._join_processes(fn) 2025-12-04T11:59:23.3513627Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:59:23.3513894Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:59:23.3514164Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:59:23.3514424Z raise RuntimeError(error) 2025-12-04T11:59:23.3514576Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T11:59:23.3514739Z Traceback (most recent call last): 2025-12-04T11:59:23.3514979Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3515221Z getattr(self, test_name)() 2025-12-04T11:59:23.3515454Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3515689Z fn() 2025-12-04T11:59:23.3515892Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3516124Z method(*args, **kwargs) 2025-12-04T11:59:23.3516344Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3516574Z method(*args, **kwargs) 2025-12-04T11:59:23.3516792Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3517052Z with policy(): 2025-12-04T11:59:23.3517264Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3517498Z raise RuntimeError(msg) 2025-12-04T11:59:23.3517925Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 3. CUDA driver allocated memory was 2250244096 and is now 3005218816. 2025-12-04T11:59:23.3518323Z 2025-12-04T11:59:23.3518402Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3518755Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:59:23.3519036Z 2025-12-04T11:59:23.3519128Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3519256Z 2025-12-04T11:59:23.3519258Z 2025-12-04T11:59:23.3519335Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:59:23.3519537Z Process 3 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:59:23.3519957Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-a7122a8fd02eb43a.xml - 2025-12-04T11:59:23.3520304Z =========================== short test summary info ============================ 2025-12-04T11:59:23.3520664Z FAILED [6.1154s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T11:59:23.3521001Z Traceback (most recent call last): 2025-12-04T11:59:23.3521386Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3521632Z getattr(self, test_name)() 2025-12-04T11:59:23.3521864Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3522097Z fn() 2025-12-04T11:59:23.3522333Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3522566Z method(*args, **kwargs) 2025-12-04T11:59:23.3522786Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3523015Z method(*args, **kwargs) 2025-12-04T11:59:23.3523231Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3523455Z with policy(): 2025-12-04T11:59:23.3523669Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3523901Z raise RuntimeError(msg) 2025-12-04T11:59:23.3524333Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 3. CUDA driver allocated memory was 2250244096 and is now 3005218816. 2025-12-04T11:59:23.3524727Z 2025-12-04T11:59:23.3524802Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3525160Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:59:23.3525440Z 2025-12-04T11:59:23.3525529Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3525751Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:59:23.3525915Z ======================= 1 failed, 1 deselected in 6.13s ======================== 2025-12-04T11:59:23.3526053Z Got exit code 1 2025-12-04T11:59:23.3526149Z Retrying single test... 
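Note on the failure above: the RuntimeError is raised by PyTorch's per-test CUDA memory leak check (enabled here via PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1), which records caching-allocator and driver memory before the test and re-checks them when the policy context exits. The following is only a minimal sketch of that idea for illustration, not the actual common_utils.py implementation, and it assumes a single visible CUDA/ROCm device:

import torch

class LeakCheckSketch:
    # Rough illustration of a before/after memory comparison.
    # The real PyTorch check (torch/testing/_internal/common_utils.py) is more
    # involved: per-device tracking, driver-level confirmation, and retries.
    def __enter__(self):
        torch.cuda.synchronize()
        self.alloc_before = torch.cuda.memory_allocated()        # caching-allocator bytes
        free, total = torch.cuda.mem_get_info()
        self.driver_before = total - free                        # driver-allocated bytes
        return self

    def __exit__(self, exc_type, exc, tb):
        torch.cuda.synchronize()
        torch.cuda.empty_cache()
        alloc_after = torch.cuda.memory_allocated()
        free, total = torch.cuda.mem_get_info()
        driver_after = total - free
        if exc_type is None and (alloc_after > self.alloc_before or driver_after > self.driver_before):
            raise RuntimeError(
                f"possible CUDA leak: allocator {self.alloc_before} -> {alloc_after}, "
                f"driver {self.driver_before} -> {driver_after}")

Used around a test body (with LeakCheckSketch(): ...), this reproduces the shape of the message seen in the log: allocator bytes before/after plus driver bytes before/after on the device.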
2025-12-04T11:59:23.3526419Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-8d06e9eb47bd46ec.xml 2025-12-04T11:59:23.3526727Z ============================= test session starts ============================== 2025-12-04T11:59:23.3526939Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:59:23.3527128Z cachedir: .pytest_cache 2025-12-04T11:59:23.3527352Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:59:23.3527591Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:59:23.3527717Z configfile: pytest.ini 2025-12-04T11:59:23.3527948Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:59:23.3528221Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:59:23.3528570Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:59:23.3528886Z Running 1 items in this shard 2025-12-04T11:59:23.3528958Z 2025-12-04T11:59:23.3529279Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda I1204 11:55:40.737000 207423 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 207492 2025-12-04T11:59:23.3529827Z I1204 11:55:40.737000 207423 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 207493 2025-12-04T11:59:23.3530179Z I1204 11:55:40.738000 207423 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 207494 2025-12-04T11:59:23.3530518Z I1204 11:55:40.739000 207423 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 207495 2025-12-04T11:59:23.3531231Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3531820Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3532407Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3532991Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3533587Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:59:23.3534165Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3534750Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3535370Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3535606Z [rank3]:E1204 11:55:45.996000 207495 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3535952Z [rank3]:E1204 11:55:45.996000 207495 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3536442Z [rank3]:E1204 11:55:45.996000 207495 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3536922Z [rank3]:E1204 11:55:45.996000 207495 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3537405Z [rank3]:E1204 11:55:45.996000 207495 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3537853Z [rank3]:E1204 11:55:45.996000 207495 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3538302Z [rank3]:E1204 11:55:45.996000 207495 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3538764Z [rank3]:E1204 11:55:45.996000 207495 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3539223Z [rank3]:E1204 11:55:45.996000 207495 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3539718Z [rank3]:E1204 11:55:45.996000 207495 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3540183Z [rank3]:E1204 11:55:45.996000 207495 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3540665Z [rank3]:E1204 11:55:45.996000 207495 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3541122Z [rank3]:E1204 11:55:45.996000 207495 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3541592Z [rank3]:E1204 11:55:45.996000 207495 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3542268Z [rank3]:E1204 11:55:45.996000 207495 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 3. CUDA driver allocated memory was 2250244096 and is now 3005218816. 2025-12-04T11:59:23.3542905Z [rank3]:E1204 11:55:45.996000 207495 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3543256Z [rank3]:E1204 11:55:45.996000 207495 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3543866Z [rank3]:E1204 11:55:45.996000 207495 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:59:23.3544419Z [rank3]:E1204 11:55:45.996000 207495 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3544783Z [rank3]:E1204 11:55:45.996000 207495 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3545200Z [rank3]:E1204 11:55:45.996000 207495 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:59:23.3545442Z dist init r=3, world=4 2025-12-04T11:59:23.3545650Z [rank0]:E1204 11:55:46.004000 207492 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3545987Z [rank0]:E1204 11:55:46.004000 207492 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3546474Z [rank0]:E1204 11:55:46.004000 207492 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3546958Z [rank0]:E1204 11:55:46.004000 207492 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3547442Z [rank0]:E1204 11:55:46.004000 207492 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3547894Z [rank0]:E1204 11:55:46.004000 207492 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3548335Z [rank0]:E1204 11:55:46.004000 207492 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3548801Z [rank0]:E1204 11:55:46.004000 207492 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3549267Z [rank0]:E1204 11:55:46.004000 207492 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3549804Z [rank0]:E1204 11:55:46.004000 207492 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3550268Z [rank0]:E1204 11:55:46.004000 207492 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3550719Z [rank0]:E1204 11:55:46.004000 207492 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3551174Z [rank0]:E1204 11:55:46.004000 207492 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3551642Z [rank0]:E1204 11:55:46.004000 207492 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3552318Z [rank0]:E1204 11:55:46.004000 207492 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2459959296 and is now 3214934016. 2025-12-04T11:59:23.3552947Z [rank0]:E1204 11:55:46.004000 207492 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3553296Z [rank0]:E1204 11:55:46.004000 207492 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3553930Z [rank0]:E1204 11:55:46.004000 207492 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:59:23.3554453Z [rank0]:E1204 11:55:46.004000 207492 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3554818Z [rank0]:E1204 11:55:46.004000 207492 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3555231Z [rank0]:E1204 11:55:46.004000 207492 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:59:23.3555474Z dist init r=0, world=4 2025-12-04T11:59:23.3555676Z [rank1]:E1204 11:55:46.012000 207493 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3556012Z [rank1]:E1204 11:55:46.012000 207493 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3556501Z [rank1]:E1204 11:55:46.012000 207493 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3556983Z [rank1]:E1204 11:55:46.012000 207493 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3557465Z [rank1]:E1204 11:55:46.012000 207493 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3557916Z [rank1]:E1204 11:55:46.012000 207493 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3558358Z [rank1]:E1204 11:55:46.012000 207493 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3558822Z [rank1]:E1204 11:55:46.012000 207493 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3559312Z [rank1]:E1204 11:55:46.012000 207493 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3559813Z [rank1]:E1204 11:55:46.012000 207493 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3560280Z [rank1]:E1204 11:55:46.012000 207493 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3560731Z [rank1]:E1204 11:55:46.012000 207493 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3561188Z [rank1]:E1204 11:55:46.012000 207493 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3561657Z [rank1]:E1204 11:55:46.012000 207493 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3562332Z [rank1]:E1204 11:55:46.012000 207493 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 
2025-12-04T11:59:23.3562992Z [rank1]:E1204 11:55:46.012000 207493 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3563340Z [rank1]:E1204 11:55:46.012000 207493 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3563943Z [rank1]:E1204 11:55:46.012000 207493 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:59:23.3564459Z [rank1]:E1204 11:55:46.012000 207493 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3564822Z [rank1]:E1204 11:55:46.012000 207493 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3565238Z [rank1]:E1204 11:55:46.012000 207493 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:59:23.3565481Z dist init r=1, world=4 2025-12-04T11:59:23.3565686Z [rank2]:E1204 11:55:46.060000 207494 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3566024Z [rank2]:E1204 11:55:46.060000 207494 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3566511Z [rank2]:E1204 11:55:46.060000 207494 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3566989Z [rank2]:E1204 11:55:46.060000 207494 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3567472Z [rank2]:E1204 11:55:46.060000 207494 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3567923Z [rank2]:E1204 11:55:46.060000 207494 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3568405Z [rank2]:E1204 11:55:46.060000 207494 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3568870Z [rank2]:E1204 11:55:46.060000 207494 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3569334Z [rank2]:E1204 11:55:46.060000 207494 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3569851Z [rank2]:E1204 11:55:46.060000 207494 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3570317Z [rank2]:E1204 11:55:46.060000 207494 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3570771Z [rank2]:E1204 11:55:46.060000 207494 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:59:23.3571228Z [rank2]:E1204 11:55:46.060000 207494 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3571696Z [rank2]:E1204 11:55:46.060000 207494 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3572401Z [rank2]:E1204 11:55:46.060000 207494 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 2. CUDA driver allocated memory was 2300575744 and is now 3055550464. 2025-12-04T11:59:23.3573035Z [rank2]:E1204 11:55:46.060000 207494 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3573385Z [rank2]:E1204 11:55:46.060000 207494 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3573993Z [rank2]:E1204 11:55:46.060000 207494 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:59:23.3574510Z [rank2]:E1204 11:55:46.060000 207494 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3574872Z [rank2]:E1204 11:55:46.060000 207494 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3575288Z [rank2]:E1204 11:55:46.060000 207494 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:59:23.3575530Z dist init r=2, world=4 2025-12-04T11:59:23.3575633Z FAILED [6.2146s] [100%] 2025-12-04T11:59:23.3575695Z 2025-12-04T11:59:23.3575759Z =================================== FAILURES =================================== 2025-12-04T11:59:23.3575956Z _ TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda __ 2025-12-04T11:59:23.3576144Z Traceback (most recent call last): 2025-12-04T11:59:23.3576390Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:59:23.3576638Z self._join_processes(fn) 2025-12-04T11:59:23.3576888Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:59:23.3577158Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:59:23.3577458Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:59:23.3577721Z raise RuntimeError(error) 2025-12-04T11:59:23.3577874Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T11:59:23.3578037Z Traceback (most recent call last): 2025-12-04T11:59:23.3578278Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3578521Z getattr(self, test_name)() 2025-12-04T11:59:23.3578760Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3578990Z fn() 2025-12-04T11:59:23.3579194Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3579427Z method(*args, **kwargs) 2025-12-04T11:59:23.3579698Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3579928Z method(*args, **kwargs) 2025-12-04T11:59:23.3580148Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3580377Z with policy(): 2025-12-04T11:59:23.3580592Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3580825Z raise RuntimeError(msg) 2025-12-04T11:59:23.3581290Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 3. CUDA driver allocated memory was 2250244096 and is now 3005218816. 2025-12-04T11:59:23.3581679Z 2025-12-04T11:59:23.3581758Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3582117Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:59:23.3582399Z 2025-12-04T11:59:23.3582488Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3582616Z 2025-12-04T11:59:23.3582618Z 2025-12-04T11:59:23.3582696Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:59:23.3582901Z Process 3 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:59:23.3583284Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-8d06e9eb47bd46ec.xml - 2025-12-04T11:59:23.3583634Z =========================== short test summary info ============================ 2025-12-04T11:59:23.3583994Z FAILED [6.2146s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T11:59:23.3584335Z Traceback (most recent call last): 2025-12-04T11:59:23.3584581Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3584827Z getattr(self, test_name)() 2025-12-04T11:59:23.3585062Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3585302Z fn() 2025-12-04T11:59:23.3585505Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3585737Z method(*args, **kwargs) 2025-12-04T11:59:23.3585957Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3586187Z method(*args, **kwargs) 2025-12-04T11:59:23.3586436Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3586665Z with policy(): 2025-12-04T11:59:23.3586878Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3587110Z raise RuntimeError(msg) 2025-12-04T11:59:23.3587541Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 3. CUDA driver allocated memory was 2250244096 and is now 3005218816. 2025-12-04T11:59:23.3587934Z 2025-12-04T11:59:23.3588010Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3588370Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:59:23.3588654Z 2025-12-04T11:59:23.3588745Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3588934Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:59:23.3589100Z ======================= 1 failed, 7 deselected in 6.22s ======================== 2025-12-04T11:59:23.3589240Z Got exit code 1 2025-12-04T11:59:23.3589335Z Retrying single test... 
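Note on the UserWarning from torch/distributed/fsdp/_init_utils.py seen in these runs ("FSDP got the argument `device_id` cuda ... which does not have an explicit index"): the warning itself names two remedies. The sketch below shows both, under the assumption that the process group is already initialized and that `local_rank` (a hypothetical variable, not taken from this log) identifies the GPU for the current process:

import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# Assumes dist.init_process_group(...) has already run for this process.
local_rank = dist.get_rank() % torch.cuda.device_count()

# Option 1 (what the warning suggests): pin the current device first, so a
# bare `device_id="cuda"` resolves unambiguously to this rank's GPU.
torch.cuda.set_device(local_rank)

# Option 2: pass an explicitly indexed device, so FSDP never has to guess.
model = torch.nn.Linear(8, 8)
fsdp_model = FSDP(model, device_id=torch.device("cuda", local_rank))

Either option silences the warning; it is informational only and is not what makes this test fail (the failure is the memory leak check described earlier).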
2025-12-04T11:59:23.3589650Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-d90e5df4b924c6b2.xml 2025-12-04T11:59:23.3589987Z ============================= test session starts ============================== 2025-12-04T11:59:23.3590200Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:59:23.3590393Z cachedir: .pytest_cache 2025-12-04T11:59:23.3590620Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:59:23.3590862Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:59:23.3590980Z configfile: pytest.ini 2025-12-04T11:59:23.3591210Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:59:23.3591487Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:59:23.3591833Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:59:23.3592153Z Running 1 items in this shard 2025-12-04T11:59:23.3592226Z 2025-12-04T11:59:23.3592550Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda I1204 11:55:49.426000 207801 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 207870 2025-12-04T11:59:23.3593067Z I1204 11:55:49.427000 207801 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 207871 2025-12-04T11:59:23.3593417Z I1204 11:55:49.428000 207801 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 207872 2025-12-04T11:59:23.3593758Z I1204 11:55:49.428000 207801 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 207873 2025-12-04T11:59:23.3594450Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3595051Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3595666Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3596253Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3596838Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:59:23.3597420Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3598011Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3598594Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3598831Z [rank3]:E1204 11:55:54.729000 207873 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3599203Z [rank3]:E1204 11:55:54.729000 207873 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3599734Z [rank3]:E1204 11:55:54.729000 207873 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3600217Z [rank3]:E1204 11:55:54.729000 207873 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3600699Z [rank3]:E1204 11:55:54.729000 207873 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3601155Z [rank3]:E1204 11:55:54.729000 207873 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3601602Z [rank3]:E1204 11:55:54.729000 207873 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3602067Z [rank3]:E1204 11:55:54.729000 207873 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3602535Z [rank3]:E1204 11:55:54.729000 207873 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3603004Z [rank3]:E1204 11:55:54.729000 207873 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3603473Z [rank3]:E1204 11:55:54.729000 207873 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3603928Z [rank3]:E1204 11:55:54.729000 207873 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3604388Z [rank3]:E1204 11:55:54.729000 207873 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3604896Z [rank3]:E1204 11:55:54.729000 207873 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3605572Z [rank3]:E1204 11:55:54.729000 207873 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 3. CUDA driver allocated memory was 2250244096 and is now 3005218816. 2025-12-04T11:59:23.3606207Z [rank3]:E1204 11:55:54.729000 207873 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3606558Z [rank3]:E1204 11:55:54.729000 207873 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3607169Z [rank3]:E1204 11:55:54.729000 207873 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:59:23.3607689Z [rank3]:E1204 11:55:54.729000 207873 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3608057Z [rank3]:E1204 11:55:54.729000 207873 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3608507Z [rank3]:E1204 11:55:54.729000 207873 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:59:23.3608753Z dist init r=3, world=4 2025-12-04T11:59:23.3608957Z [rank2]:E1204 11:55:54.733000 207872 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3609295Z [rank2]:E1204 11:55:54.733000 207872 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3609815Z [rank2]:E1204 11:55:54.733000 207872 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3610301Z [rank2]:E1204 11:55:54.733000 207872 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3610783Z [rank2]:E1204 11:55:54.733000 207872 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3611237Z [rank2]:E1204 11:55:54.733000 207872 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3611683Z [rank2]:E1204 11:55:54.733000 207872 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3612156Z [rank2]:E1204 11:55:54.733000 207872 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3612631Z [rank2]:E1204 11:55:54.733000 207872 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3613104Z [rank2]:E1204 11:55:54.733000 207872 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3613572Z [rank2]:E1204 11:55:54.733000 207872 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3614064Z [rank2]:E1204 11:55:54.733000 207872 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3614523Z [rank2]:E1204 11:55:54.733000 207872 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3614993Z [rank2]:E1204 11:55:54.733000 207872 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3615665Z [rank2]:E1204 11:55:54.733000 207872 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 2. CUDA driver allocated memory was 2300575744 and is now 3055550464. 2025-12-04T11:59:23.3616298Z [rank2]:E1204 11:55:54.733000 207872 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3616651Z [rank2]:E1204 11:55:54.733000 207872 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3617261Z [rank2]:E1204 11:55:54.733000 207872 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:59:23.3617818Z [rank2]:E1204 11:55:54.733000 207872 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3618181Z [rank2]:E1204 11:55:54.733000 207872 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3618596Z [rank2]:E1204 11:55:54.733000 207872 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:59:23.3618839Z dist init r=2, world=4 2025-12-04T11:59:23.3619040Z [rank1]:E1204 11:55:54.745000 207871 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3619380Z [rank1]:E1204 11:55:54.745000 207871 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3619904Z [rank1]:E1204 11:55:54.745000 207871 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3620391Z [rank1]:E1204 11:55:54.745000 207871 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3620878Z [rank1]:E1204 11:55:54.745000 207871 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3621328Z [rank1]:E1204 11:55:54.745000 207871 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3621771Z [rank1]:E1204 11:55:54.745000 207871 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3622237Z [rank1]:E1204 11:55:54.745000 207871 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3622707Z [rank1]:E1204 11:55:54.745000 207871 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3623171Z [rank1]:E1204 11:55:54.745000 207871 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3623670Z [rank1]:E1204 11:55:54.745000 207871 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3624127Z [rank1]:E1204 11:55:54.745000 207871 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3624584Z [rank1]:E1204 11:55:54.745000 207871 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3625056Z [rank1]:E1204 11:55:54.745000 207871 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3625735Z [rank1]:E1204 11:55:54.745000 207871 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 
2025-12-04T11:59:23.3626368Z [rank1]:E1204 11:55:54.745000 207871 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3626720Z [rank1]:E1204 11:55:54.745000 207871 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3627361Z [rank1]:E1204 11:55:54.745000 207871 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:59:23.3627873Z [rank1]:E1204 11:55:54.745000 207871 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3628237Z [rank1]:E1204 11:55:54.745000 207871 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3628648Z [rank1]:E1204 11:55:54.745000 207871 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:59:23.3628893Z dist init r=1, world=4 2025-12-04T11:59:23.3629090Z [rank0]:E1204 11:55:54.755000 207870 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3629433Z [rank0]:E1204 11:55:54.755000 207870 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3629950Z [rank0]:E1204 11:55:54.755000 207870 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3630428Z [rank0]:E1204 11:55:54.755000 207870 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3630904Z [rank0]:E1204 11:55:54.755000 207870 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3631351Z [rank0]:E1204 11:55:54.755000 207870 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3631795Z [rank0]:E1204 11:55:54.755000 207870 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3632259Z [rank0]:E1204 11:55:54.755000 207870 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3632749Z [rank0]:E1204 11:55:54.755000 207870 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3633211Z [rank0]:E1204 11:55:54.755000 207870 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3633671Z [rank0]:E1204 11:55:54.755000 207870 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3634126Z [rank0]:E1204 11:55:54.755000 207870 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:59:23.3634577Z [rank0]:E1204 11:55:54.755000 207870 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3635046Z [rank0]:E1204 11:55:54.755000 207870 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3635717Z [rank0]:E1204 11:55:54.755000 207870 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2459959296 and is now 3214934016. 2025-12-04T11:59:23.3636379Z [rank0]:E1204 11:55:54.755000 207870 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3636726Z [rank0]:E1204 11:55:54.755000 207870 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3637332Z [rank0]:E1204 11:55:54.755000 207870 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:59:23.3637846Z [rank0]:E1204 11:55:54.755000 207870 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3638202Z [rank0]:E1204 11:55:54.755000 207870 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3638613Z [rank0]:E1204 11:55:54.755000 207870 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:59:23.3638848Z dist init r=0, world=4 2025-12-04T11:59:23.3638945Z FAILED [6.2159s] [100%] 2025-12-04T11:59:23.3639006Z 2025-12-04T11:59:23.3639064Z =================================== FAILURES =================================== 2025-12-04T11:59:23.3639268Z _ TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda __ 2025-12-04T11:59:23.3639449Z Traceback (most recent call last): 2025-12-04T11:59:23.3639730Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:59:23.3639968Z self._join_processes(fn) 2025-12-04T11:59:23.3640212Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:59:23.3640476Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:59:23.3640744Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:59:23.3641002Z raise RuntimeError(error) 2025-12-04T11:59:23.3641159Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T11:59:23.3641323Z Traceback (most recent call last): 2025-12-04T11:59:23.3641592Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3641839Z getattr(self, test_name)() 2025-12-04T11:59:23.3642073Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3642311Z fn() 2025-12-04T11:59:23.3642515Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3642751Z method(*args, **kwargs) 2025-12-04T11:59:23.3642973Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3643208Z method(*args, **kwargs) 2025-12-04T11:59:23.3643429Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3643660Z with policy(): 2025-12-04T11:59:23.3643878Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3644115Z raise RuntimeError(msg) 2025-12-04T11:59:23.3644541Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 3. CUDA driver allocated memory was 2250244096 and is now 3005218816. 2025-12-04T11:59:23.3644935Z 2025-12-04T11:59:23.3645046Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3645408Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:59:23.3645692Z 2025-12-04T11:59:23.3645782Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3645910Z 2025-12-04T11:59:23.3645912Z 2025-12-04T11:59:23.3645991Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:59:23.3646197Z Process 3 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:59:23.3646575Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-d90e5df4b924c6b2.xml - 2025-12-04T11:59:23.3646920Z =========================== short test summary info ============================ 2025-12-04T11:59:23.3647287Z FAILED [6.2159s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T11:59:23.3647626Z Traceback (most recent call last): 2025-12-04T11:59:23.3647873Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3648120Z getattr(self, test_name)() 2025-12-04T11:59:23.3648355Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3648592Z fn() 2025-12-04T11:59:23.3648799Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3649034Z method(*args, **kwargs) 2025-12-04T11:59:23.3649259Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3649494Z method(*args, **kwargs) 2025-12-04T11:59:23.3649759Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3649990Z with policy(): 2025-12-04T11:59:23.3650207Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3650443Z raise RuntimeError(msg) 2025-12-04T11:59:23.3650907Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 3. CUDA driver allocated memory was 2250244096 and is now 3005218816. 2025-12-04T11:59:23.3651304Z 2025-12-04T11:59:23.3651381Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3651742Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:59:23.3652021Z 2025-12-04T11:59:23.3652114Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3652308Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
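The RuntimeError above compares two measurements per device: bytes held by PyTorch's caching allocator (512 before the test, 2560 after) and bytes the driver reports as allocated. What follows is only a minimal, illustrative sketch of that kind of before/after comparison, assuming a visible CUDA/ROCm device; the names snapshot and check_for_leak are invented here, and this is not the implementation in torch/testing/_internal/common_utils.py.

import gc
import torch

def snapshot(device: int):
    # Settle outstanding work and return cached blocks before measuring.
    torch.cuda.synchronize(device)
    gc.collect()
    torch.cuda.empty_cache()
    allocator_bytes = torch.cuda.memory_allocated(device)   # "Caching allocator allocated memory"
    free, total = torch.cuda.mem_get_info(device)
    driver_bytes = total - free                              # "CUDA driver allocated memory"
    return allocator_bytes, driver_bytes

def check_for_leak(fn, device: int = 0) -> None:
    before = snapshot(device)
    fn()
    after = snapshot(device)
    # Flag growth in both the allocator view and the driver view, which is the
    # situation the "CUDA driver API confirmed a leak" message above describes.
    if after[0] > before[0] and after[1] > before[1]:
        raise RuntimeError(f"possible leak on device {device}: {before} -> {after}")

The repro line printed with each failure (PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py ...) re-runs a single test with this per-device check enabled.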
2025-12-04T11:59:23.3652375Z ======================= 1 failed, 7 deselected in 6.23s ======================== 2025-12-04T11:59:23.3652423Z Got exit code 1 2025-12-04T11:59:23.3652617Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:59:23.3652751Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T11:59:23.3652960Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-d02ca8307c9a881c.xml 2025-12-04T11:59:23.3653058Z ============================= test session starts ============================== 2025-12-04T11:59:23.3653174Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:59:23.3653220Z cachedir: .pytest_cache 2025-12-04T11:59:23.3653385Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:59:23.3653439Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:59:23.3653481Z configfile: pytest.ini 2025-12-04T11:59:23.3653650Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:59:23.3653724Z collecting ... collected 8 items / 2 deselected / 6 selected 2025-12-04T11:59:23.3653784Z stepcurrent: skipping 2 already run items. 2025-12-04T11:59:23.3653829Z Running 6 items in this shard 2025-12-04T11:59:23.3653831Z 2025-12-04T11:59:23.3654193Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda I1204 11:55:58.229000 208179 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 208248 2025-12-04T11:59:23.3654355Z I1204 11:55:58.229000 208179 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 208249 2025-12-04T11:59:23.3654511Z I1204 11:55:58.230000 208179 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 208250 2025-12-04T11:59:23.3654668Z I1204 11:55:58.230000 208179 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 208251 2025-12-04T11:59:23.3655169Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3655238Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3655754Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:59:23.3655818Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3656307Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3656367Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3656859Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3656915Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3657059Z [rank2]:E1204 11:56:05.254000 208250 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3657225Z [rank2]:E1204 11:56:05.254000 208250 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3657540Z [rank2]:E1204 11:56:05.254000 208250 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3657701Z [rank2]:E1204 11:56:05.254000 208250 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3657994Z [rank2]:E1204 11:56:05.254000 208250 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3658124Z [rank2]:E1204 11:56:05.254000 208250 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3658406Z [rank2]:E1204 11:56:05.254000 208250 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3658563Z [rank2]:E1204 11:56:05.254000 208250 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3658842Z [rank2]:E1204 11:56:05.254000 208250 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3658993Z [rank2]:E1204 11:56:05.254000 208250 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3659273Z [rank2]:E1204 11:56:05.254000 208250 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3659412Z [rank2]:E1204 11:56:05.254000 208250 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3659748Z [rank2]:E1204 11:56:05.254000 208250 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3659898Z [rank2]:E1204 11:56:05.254000 208250 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3660454Z [rank2]:E1204 11:56:05.254000 208250 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:59:23.3660576Z [rank2]:E1204 11:56:05.254000 208250 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3660776Z [rank2]:E1204 11:56:05.254000 208250 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3661189Z [rank2]:E1204 11:56:05.254000 208250 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3661305Z [rank2]:E1204 11:56:05.254000 208250 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3661521Z [rank2]:E1204 11:56:05.254000 208250 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3661687Z [rank2]:E1204 11:56:05.254000 208250 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:59:23.3661767Z dist init r=2, world=4 2025-12-04T11:59:23.3661909Z [rank1]:E1204 11:56:05.258000 208249 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3662074Z [rank1]:E1204 11:56:05.258000 208249 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3662369Z [rank1]:E1204 11:56:05.258000 208249 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3662524Z [rank1]:E1204 11:56:05.258000 208249 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3662816Z [rank1]:E1204 11:56:05.258000 208249 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3662946Z [rank1]:E1204 11:56:05.258000 208249 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3663231Z [rank1]:E1204 11:56:05.258000 208249 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3663381Z [rank1]:E1204 11:56:05.258000 208249 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:59:23.3663664Z [rank1]:E1204 11:56:05.258000 208249 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3663814Z [rank1]:E1204 11:56:05.258000 208249 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3664094Z [rank1]:E1204 11:56:05.258000 208249 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3664235Z [rank1]:E1204 11:56:05.258000 208249 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3664536Z [rank1]:E1204 11:56:05.258000 208249 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3664687Z [rank1]:E1204 11:56:05.258000 208249 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3665203Z [rank1]:E1204 11:56:05.258000 208249 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:59:23.3665323Z [rank1]:E1204 11:56:05.258000 208249 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3665524Z [rank1]:E1204 11:56:05.258000 208249 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3665936Z [rank1]:E1204 11:56:05.258000 208249 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3666051Z [rank1]:E1204 11:56:05.258000 208249 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3666287Z [rank1]:E1204 11:56:05.258000 208249 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3666456Z [rank1]:E1204 11:56:05.258000 208249 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:59:23.3666495Z dist init r=1, world=4 2025-12-04T11:59:23.3666637Z [rank3]:E1204 11:56:05.275000 208251 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3666796Z [rank3]:E1204 11:56:05.275000 208251 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3672403Z [rank3]:E1204 11:56:05.275000 208251 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3672568Z [rank3]:E1204 11:56:05.275000 208251 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3672853Z [rank3]:E1204 11:56:05.275000 208251 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3672981Z [rank3]:E1204 11:56:05.275000 208251 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3673259Z [rank3]:E1204 11:56:05.275000 208251 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3673408Z [rank3]:E1204 11:56:05.275000 208251 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3673687Z [rank3]:E1204 11:56:05.275000 208251 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3673834Z [rank3]:E1204 11:56:05.275000 208251 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3674156Z [rank3]:E1204 11:56:05.275000 208251 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3674293Z [rank3]:E1204 11:56:05.275000 208251 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3674576Z [rank3]:E1204 11:56:05.275000 208251 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3674728Z [rank3]:E1204 11:56:05.275000 208251 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3675245Z [rank3]:E1204 11:56:05.275000 208251 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 
2025-12-04T11:59:23.3675358Z [rank3]:E1204 11:56:05.275000 208251 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3675555Z [rank3]:E1204 11:56:05.275000 208251 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3675962Z [rank3]:E1204 11:56:05.275000 208251 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3676107Z [rank3]:E1204 11:56:05.275000 208251 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3676320Z [rank3]:E1204 11:56:05.275000 208251 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3676486Z [rank3]:E1204 11:56:05.275000 208251 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:59:23.3676526Z dist init r=3, world=4 2025-12-04T11:59:23.3676664Z [rank0]:E1204 11:56:05.336000 208248 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3676824Z [rank0]:E1204 11:56:05.336000 208248 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3677112Z [rank0]:E1204 11:56:05.336000 208248 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3677268Z [rank0]:E1204 11:56:05.336000 208248 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3677554Z [rank0]:E1204 11:56:05.336000 208248 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3677677Z [rank0]:E1204 11:56:05.336000 208248 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3677957Z [rank0]:E1204 11:56:05.336000 208248 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3678104Z [rank0]:E1204 11:56:05.336000 208248 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3678625Z [rank0]:E1204 11:56:05.336000 208248 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3678775Z [rank0]:E1204 11:56:05.336000 208248 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3679053Z [rank0]:E1204 11:56:05.336000 208248 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3679197Z [rank0]:E1204 11:56:05.336000 208248 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3679479Z [rank0]:E1204 11:56:05.336000 208248 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3679668Z [rank0]:E1204 11:56:05.336000 208248 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3680200Z [rank0]:E1204 11:56:05.336000 208248 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:59:23.3680341Z [rank0]:E1204 11:56:05.336000 208248 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3680542Z [rank0]:E1204 11:56:05.336000 208248 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3680948Z [rank0]:E1204 11:56:05.336000 208248 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3681063Z [rank0]:E1204 11:56:05.336000 208248 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3681278Z [rank0]:E1204 11:56:05.336000 208248 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3681445Z [rank0]:E1204 11:56:05.336000 208248 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:59:23.3681487Z dist init r=0, world=4 2025-12-04T11:59:23.3681846Z [rank0]:[W1204 11:56:05.732202649 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:59:23.3681890Z FAILED [8.9182s] [ 16%] 2025-12-04T11:59:23.3681892Z 2025-12-04T11:59:23.3681951Z =================================== FAILURES =================================== 2025-12-04T11:59:23.3682090Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda _ 2025-12-04T11:59:23.3682137Z Traceback (most recent call last): 2025-12-04T11:59:23.3682306Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:59:23.3682353Z self._join_processes(fn) 2025-12-04T11:59:23.3682533Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:59:23.3682590Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:59:23.3682811Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:59:23.3682856Z raise RuntimeError(error) 2025-12-04T11:59:23.3682942Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:59:23.3682989Z Traceback (most recent call last): 2025-12-04T11:59:23.3683153Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3683198Z getattr(self, test_name)() 2025-12-04T11:59:23.3683357Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3683399Z fn() 2025-12-04T11:59:23.3683554Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3683599Z method(*args, **kwargs) 2025-12-04T11:59:23.3683754Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3683800Z method(*args, **kwargs) 2025-12-04T11:59:23.3683952Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3683996Z with policy(): 2025-12-04T11:59:23.3684151Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3684195Z raise RuntimeError(msg) 2025-12-04T11:59:23.3684592Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 
2025-12-04T11:59:23.3684618Z 2025-12-04T11:59:23.3684698Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3684984Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3684990Z 2025-12-04T11:59:23.3685081Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3685083Z 2025-12-04T11:59:23.3685146Z Process 2 exited with error code 10 and exception: 2025-12-04T11:59:23.3685192Z Traceback (most recent call last): 2025-12-04T11:59:23.3685361Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3685406Z getattr(self, test_name)() 2025-12-04T11:59:23.3685571Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3685607Z fn() 2025-12-04T11:59:23.3685762Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3685803Z method(*args, **kwargs) 2025-12-04T11:59:23.3685957Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3685997Z method(*args, **kwargs) 2025-12-04T11:59:23.3686149Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3686187Z with policy(): 2025-12-04T11:59:23.3686344Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3686385Z raise RuntimeError(msg) 2025-12-04T11:59:23.3686799Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 
2025-12-04T11:59:23.3686802Z 2025-12-04T11:59:23.3686879Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3687160Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3687162Z 2025-12-04T11:59:23.3687256Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3687260Z 2025-12-04T11:59:23.3687319Z Process 3 exited with error code 10 and exception: 2025-12-04T11:59:23.3687367Z Traceback (most recent call last): 2025-12-04T11:59:23.3687531Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3687577Z getattr(self, test_name)() 2025-12-04T11:59:23.3687740Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3687778Z fn() 2025-12-04T11:59:23.3687931Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3687973Z method(*args, **kwargs) 2025-12-04T11:59:23.3688124Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3688166Z method(*args, **kwargs) 2025-12-04T11:59:23.3688343Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3688384Z with policy(): 2025-12-04T11:59:23.3688535Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3688580Z raise RuntimeError(msg) 2025-12-04T11:59:23.3688971Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 2025-12-04T11:59:23.3688973Z 2025-12-04T11:59:23.3689047Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3689329Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3689333Z 2025-12-04T11:59:23.3689419Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3689421Z 2025-12-04T11:59:23.3689423Z 2025-12-04T11:59:23.3689504Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:59:23.3689627Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:59:23.3689885Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-d02ca8307c9a881c.xml - 2025-12-04T11:59:23.3689946Z =========================== short test summary info ============================ 2025-12-04T11:59:23.3690246Z FAILED [8.9182s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:59:23.3690295Z Traceback (most recent call last): 2025-12-04T11:59:23.3690462Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3690507Z getattr(self, test_name)() 2025-12-04T11:59:23.3690670Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3690735Z fn() 2025-12-04T11:59:23.3690891Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3690933Z method(*args, **kwargs) 2025-12-04T11:59:23.3691084Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3691126Z method(*args, **kwargs) 2025-12-04T11:59:23.3691277Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3691320Z with policy(): 2025-12-04T11:59:23.3691475Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3691518Z raise RuntimeError(msg) 2025-12-04T11:59:23.3691908Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 
2025-12-04T11:59:23.3691910Z 2025-12-04T11:59:23.3691987Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3692269Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3692299Z 2025-12-04T11:59:23.3692385Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3692387Z 2025-12-04T11:59:23.3692447Z Process 2 exited with error code 10 and exception: 2025-12-04T11:59:23.3692491Z Traceback (most recent call last): 2025-12-04T11:59:23.3692655Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3692700Z getattr(self, test_name)() 2025-12-04T11:59:23.3692863Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3692897Z fn() 2025-12-04T11:59:23.3693052Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3693091Z method(*args, **kwargs) 2025-12-04T11:59:23.3693246Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3693287Z method(*args, **kwargs) 2025-12-04T11:59:23.3693440Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3693477Z with policy(): 2025-12-04T11:59:23.3693637Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3693679Z raise RuntimeError(msg) 2025-12-04T11:59:23.3694071Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 
2025-12-04T11:59:23.3694074Z 2025-12-04T11:59:23.3694149Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3694433Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3694435Z 2025-12-04T11:59:23.3694524Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3694526Z 2025-12-04T11:59:23.3694584Z Process 3 exited with error code 10 and exception: 2025-12-04T11:59:23.3694655Z Traceback (most recent call last): 2025-12-04T11:59:23.3694819Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3694868Z getattr(self, test_name)() 2025-12-04T11:59:23.3695028Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3695066Z fn() 2025-12-04T11:59:23.3695219Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3695264Z method(*args, **kwargs) 2025-12-04T11:59:23.3695415Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3695457Z method(*args, **kwargs) 2025-12-04T11:59:23.3695609Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3695651Z with policy(): 2025-12-04T11:59:23.3695807Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3695848Z raise RuntimeError(msg) 2025-12-04T11:59:23.3696241Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 2025-12-04T11:59:23.3696264Z 2025-12-04T11:59:23.3696338Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3696622Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3696626Z 2025-12-04T11:59:23.3696712Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3696779Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:59:23.3696843Z ======================= 1 failed, 2 deselected in 8.93s ======================== 2025-12-04T11:59:23.3696883Z Got exit code 1 2025-12-04T11:59:23.3696924Z Retrying single test... 
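The ProcessGroupNCCL warning earlier in this block ("destroy_process_group() was not called before program exit, which can leak resources") points at the shutdown step a standalone torch.distributed script would normally perform. Below is a minimal sketch, assuming a single CPU-only rank with the gloo backend so it runs without a GPU (the CI job itself uses NCCL/RCCL); the rendezvous address and port are placeholders.

import os
import torch.distributed as dist

# Rendezvous settings for a one-process group; the values are arbitrary placeholders.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")

dist.init_process_group(backend="gloo", rank=0, world_size=1)
try:
    dist.barrier()  # stand-in for the real collective work
finally:
    # Explicit shutdown; skipping this is what triggers the warning quoted above.
    dist.destroy_process_group()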
2025-12-04T11:59:23.3697135Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-d5eb9510b4a5dc42.xml 2025-12-04T11:59:23.3697197Z ============================= test session starts ============================== 2025-12-04T11:59:23.3697316Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:59:23.3697361Z cachedir: .pytest_cache 2025-12-04T11:59:23.3697527Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:59:23.3697576Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:59:23.3697617Z configfile: pytest.ini 2025-12-04T11:59:23.3697787Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:59:23.3697860Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:59:23.3698137Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3698183Z Running 1 items in this shard 2025-12-04T11:59:23.3698185Z 2025-12-04T11:59:23.3698568Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda I1204 11:56:09.846000 208581 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 208650 2025-12-04T11:59:23.3698727Z I1204 11:56:09.847000 208581 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 208651 2025-12-04T11:59:23.3698882Z I1204 11:56:09.847000 208581 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 208652 2025-12-04T11:59:23.3699032Z I1204 11:56:09.848000 208581 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 208653 2025-12-04T11:59:23.3699539Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3699650Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3700148Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3700213Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3700728Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:59:23.3700792Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3701285Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3701342Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3701490Z [rank2]:E1204 11:56:16.875000 208652 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3701655Z [rank2]:E1204 11:56:16.875000 208652 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3701955Z [rank2]:E1204 11:56:16.875000 208652 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3702114Z [rank2]:E1204 11:56:16.875000 208652 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3702409Z [rank2]:E1204 11:56:16.875000 208652 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3702539Z [rank2]:E1204 11:56:16.875000 208652 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3702820Z [rank2]:E1204 11:56:16.875000 208652 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3702970Z [rank2]:E1204 11:56:16.875000 208652 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3703273Z [rank2]:E1204 11:56:16.875000 208652 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3703423Z [rank2]:E1204 11:56:16.875000 208652 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3703701Z [rank2]:E1204 11:56:16.875000 208652 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3703842Z [rank2]:E1204 11:56:16.875000 208652 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3704126Z [rank2]:E1204 11:56:16.875000 208652 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3704274Z [rank2]:E1204 11:56:16.875000 208652 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3704801Z [rank2]:E1204 11:56:16.875000 208652 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:59:23.3704939Z [rank2]:E1204 11:56:16.875000 208652 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3705138Z [rank2]:E1204 11:56:16.875000 208652 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3705547Z [rank2]:E1204 11:56:16.875000 208652 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3705664Z [rank2]:E1204 11:56:16.875000 208652 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3705883Z [rank2]:E1204 11:56:16.875000 208652 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3706050Z [rank2]:E1204 11:56:16.875000 208652 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:59:23.3706090Z dist init r=2, world=4 2025-12-04T11:59:23.3706232Z [rank1]:E1204 11:56:16.877000 208651 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3706393Z [rank1]:E1204 11:56:16.877000 208651 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3706683Z [rank1]:E1204 11:56:16.877000 208651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3706838Z [rank1]:E1204 11:56:16.877000 208651 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3707130Z [rank1]:E1204 11:56:16.877000 208651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3707254Z [rank1]:E1204 11:56:16.877000 208651 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3707587Z [rank1]:E1204 11:56:16.877000 208651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3707735Z [rank1]:E1204 11:56:16.877000 208651 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3708014Z [rank1]:E1204 11:56:16.877000 208651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3708165Z [rank1]:E1204 11:56:16.877000 208651 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3708444Z [rank1]:E1204 11:56:16.877000 208651 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3708582Z [rank1]:E1204 11:56:16.877000 208651 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3708861Z [rank1]:E1204 11:56:16.877000 208651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3709011Z [rank1]:E1204 11:56:16.877000 208651 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3709553Z [rank1]:E1204 11:56:16.877000 208651 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:59:23.3709710Z [rank1]:E1204 11:56:16.877000 208651 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3709908Z [rank1]:E1204 11:56:16.877000 208651 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3710315Z [rank1]:E1204 11:56:16.877000 208651 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3710431Z [rank1]:E1204 11:56:16.877000 208651 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3710644Z [rank1]:E1204 11:56:16.877000 208651 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3710810Z [rank1]:E1204 11:56:16.877000 208651 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:59:23.3710849Z dist init r=1, world=4 2025-12-04T11:59:23.3710989Z [rank3]:E1204 11:56:16.880000 208653 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3711149Z [rank3]:E1204 11:56:16.880000 208653 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3711439Z [rank3]:E1204 11:56:16.880000 208653 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3711593Z [rank3]:E1204 11:56:16.880000 208653 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3711905Z [rank3]:E1204 11:56:16.880000 208653 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3712027Z [rank3]:E1204 11:56:16.880000 208653 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 
2025-12-04T11:59:23.3712304Z [rank3]:E1204 11:56:16.880000 208653 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3712455Z [rank3]:E1204 11:56:16.880000 208653 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3712733Z [rank3]:E1204 11:56:16.880000 208653 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3712882Z [rank3]:E1204 11:56:16.880000 208653 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3713162Z [rank3]:E1204 11:56:16.880000 208653 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3713297Z [rank3]:E1204 11:56:16.880000 208653 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3713619Z [rank3]:E1204 11:56:16.880000 208653 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3713767Z [rank3]:E1204 11:56:16.880000 208653 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3714290Z [rank3]:E1204 11:56:16.880000 208653 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 
2025-12-04T11:59:23.3714407Z [rank3]:E1204 11:56:16.880000 208653 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3714604Z [rank3]:E1204 11:56:16.880000 208653 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3715013Z [rank3]:E1204 11:56:16.880000 208653 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3715126Z [rank3]:E1204 11:56:16.880000 208653 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3715338Z [rank3]:E1204 11:56:16.880000 208653 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3715502Z [rank3]:E1204 11:56:16.880000 208653 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:59:23.3715542Z dist init r=3, world=4 2025-12-04T11:59:23.3715677Z [rank0]:E1204 11:56:16.924000 208650 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3715837Z [rank0]:E1204 11:56:16.924000 208650 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3716144Z [rank0]:E1204 11:56:16.924000 208650 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3716298Z [rank0]:E1204 11:56:16.924000 208650 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3716582Z [rank0]:E1204 11:56:16.924000 208650 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3716706Z [rank0]:E1204 11:56:16.924000 208650 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3716984Z [rank0]:E1204 11:56:16.924000 208650 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3717132Z [rank0]:E1204 11:56:16.924000 208650 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3717410Z [rank0]:E1204 11:56:16.924000 208650 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3717557Z [rank0]:E1204 11:56:16.924000 208650 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3717850Z [rank0]:E1204 11:56:16.924000 208650 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3717987Z [rank0]:E1204 11:56:16.924000 208650 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3718265Z [rank0]:E1204 11:56:16.924000 208650 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3718415Z [rank0]:E1204 11:56:16.924000 208650 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3718927Z [rank0]:E1204 11:56:16.924000 208650 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:59:23.3719044Z [rank0]:E1204 11:56:16.924000 208650 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3719244Z [rank0]:E1204 11:56:16.924000 208650 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3719719Z [rank0]:E1204 11:56:16.924000 208650 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3719834Z [rank0]:E1204 11:56:16.924000 208650 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3720045Z [rank0]:E1204 11:56:16.924000 208650 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3720211Z [rank0]:E1204 11:56:16.924000 208650 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:59:23.3720250Z dist init r=0, world=4 2025-12-04T11:59:23.3720617Z [rank0]:[W1204 11:56:17.210056491 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:59:23.3720659Z FAILED [9.0183s] [100%] 2025-12-04T11:59:23.3720662Z 2025-12-04T11:59:23.3720719Z =================================== FAILURES =================================== 2025-12-04T11:59:23.3720860Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda _ 2025-12-04T11:59:23.3720906Z Traceback (most recent call last): 2025-12-04T11:59:23.3721073Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:59:23.3721116Z self._join_processes(fn) 2025-12-04T11:59:23.3721294Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:59:23.3721347Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:59:23.3721531Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:59:23.3721575Z raise RuntimeError(error) 2025-12-04T11:59:23.3721657Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:59:23.3721702Z Traceback (most recent call last): 2025-12-04T11:59:23.3721894Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3721937Z getattr(self, test_name)() 2025-12-04T11:59:23.3722099Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3722134Z fn() 2025-12-04T11:59:23.3722291Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3722332Z method(*args, **kwargs) 2025-12-04T11:59:23.3722485Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3722525Z method(*args, **kwargs) 2025-12-04T11:59:23.3722679Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3722718Z with policy(): 2025-12-04T11:59:23.3722873Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3722916Z raise RuntimeError(msg) 2025-12-04T11:59:23.3723307Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 
2025-12-04T11:59:23.3723310Z 2025-12-04T11:59:23.3723388Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3723670Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3723672Z 2025-12-04T11:59:23.3723763Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3723767Z 2025-12-04T11:59:23.3723826Z Process 2 exited with error code 10 and exception: 2025-12-04T11:59:23.3723873Z Traceback (most recent call last): 2025-12-04T11:59:23.3724039Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3724082Z getattr(self, test_name)() 2025-12-04T11:59:23.3724263Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3724300Z fn() 2025-12-04T11:59:23.3724453Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3724492Z method(*args, **kwargs) 2025-12-04T11:59:23.3724645Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3724686Z method(*args, **kwargs) 2025-12-04T11:59:23.3724840Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3724877Z with policy(): 2025-12-04T11:59:23.3725032Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3725073Z raise RuntimeError(msg) 2025-12-04T11:59:23.3725465Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 
2025-12-04T11:59:23.3725468Z 2025-12-04T11:59:23.3725542Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3725827Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3725850Z 2025-12-04T11:59:23.3725936Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3725939Z 2025-12-04T11:59:23.3725998Z Process 3 exited with error code 10 and exception: 2025-12-04T11:59:23.3726043Z Traceback (most recent call last): 2025-12-04T11:59:23.3726209Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3726251Z getattr(self, test_name)() 2025-12-04T11:59:23.3726414Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3726448Z fn() 2025-12-04T11:59:23.3726597Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3726639Z method(*args, **kwargs) 2025-12-04T11:59:23.3726789Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3726827Z method(*args, **kwargs) 2025-12-04T11:59:23.3726979Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3727015Z with policy(): 2025-12-04T11:59:23.3727169Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3727211Z raise RuntimeError(msg) 2025-12-04T11:59:23.3727598Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 2025-12-04T11:59:23.3727602Z 2025-12-04T11:59:23.3727676Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3727958Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3727961Z 2025-12-04T11:59:23.3728046Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3728069Z 2025-12-04T11:59:23.3728071Z 2025-12-04T11:59:23.3728149Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:59:23.3728237Z Process 1 terminated with exit code 10, terminating remaining processes. 
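The two warnings that precede this failure point at one setup pattern: each rank should select its own GPU before the model is wrapped in FSDP, and the process group should be torn down explicitly before the process exits. A minimal sketch of that pattern, assuming a per-rank entry point (the helper name run_rank, the rendezvous defaults, and the placeholder body are illustrative, not taken from this test):

    import os
    import torch
    import torch.distributed as dist

    def run_rank(rank: int, world_size: int) -> None:
        os.environ.setdefault("MASTER_ADDR", "127.0.0.1")   # rendezvous defaults for a single-node run
        os.environ.setdefault("MASTER_PORT", "29500")
        torch.cuda.set_device(rank)                          # pin this process to its GPU before FSDP init
        dist.init_process_group("nccl", rank=rank, world_size=world_size)
        try:
            # build the FSDP-wrapped model and run the test body here
            pass
        finally:
            dist.destroy_process_group()                     # explicit teardown avoids the shutdown warning

With set_device called per rank, a device_id of "cuda" without an explicit index resolves to the intended current device, and the explicit destroy_process_group() call is what the ProcessGroupNCCL warning above asks for.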
2025-12-04T11:59:23.3728493Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-d5eb9510b4a5dc42.xml - 2025-12-04T11:59:23.3728556Z =========================== short test summary info ============================ 2025-12-04T11:59:23.3728853Z FAILED [9.0183s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:59:23.3728898Z Traceback (most recent call last): 2025-12-04T11:59:23.3729066Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3729107Z getattr(self, test_name)() 2025-12-04T11:59:23.3729268Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3729302Z fn() 2025-12-04T11:59:23.3729456Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3729496Z method(*args, **kwargs) 2025-12-04T11:59:23.3729721Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3729761Z method(*args, **kwargs) 2025-12-04T11:59:23.3729915Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3729954Z with policy(): 2025-12-04T11:59:23.3730111Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3730152Z raise RuntimeError(msg) 2025-12-04T11:59:23.3730539Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 
2025-12-04T11:59:23.3730542Z 2025-12-04T11:59:23.3730615Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3730895Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3730897Z 2025-12-04T11:59:23.3730985Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3730987Z 2025-12-04T11:59:23.3731046Z Process 2 exited with error code 10 and exception: 2025-12-04T11:59:23.3731092Z Traceback (most recent call last): 2025-12-04T11:59:23.3731255Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3731297Z getattr(self, test_name)() 2025-12-04T11:59:23.3731460Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3731494Z fn() 2025-12-04T11:59:23.3731647Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3731687Z method(*args, **kwargs) 2025-12-04T11:59:23.3731836Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3731874Z method(*args, **kwargs) 2025-12-04T11:59:23.3732055Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3732091Z with policy(): 2025-12-04T11:59:23.3732242Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3732283Z raise RuntimeError(msg) 2025-12-04T11:59:23.3732670Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 
2025-12-04T11:59:23.3732674Z 2025-12-04T11:59:23.3732746Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3733028Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3733030Z 2025-12-04T11:59:23.3733117Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3733119Z 2025-12-04T11:59:23.3733178Z Process 3 exited with error code 10 and exception: 2025-12-04T11:59:23.3733222Z Traceback (most recent call last): 2025-12-04T11:59:23.3733388Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3733458Z getattr(self, test_name)() 2025-12-04T11:59:23.3733619Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3733653Z fn() 2025-12-04T11:59:23.3733802Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3733841Z method(*args, **kwargs) 2025-12-04T11:59:23.3733992Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3734032Z method(*args, **kwargs) 2025-12-04T11:59:23.3734180Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3734217Z with policy(): 2025-12-04T11:59:23.3734369Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3734411Z raise RuntimeError(msg) 2025-12-04T11:59:23.3734797Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 2025-12-04T11:59:23.3734799Z 2025-12-04T11:59:23.3734874Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3735150Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3735152Z 2025-12-04T11:59:23.3735236Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3735301Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:59:23.3735364Z ======================= 1 failed, 7 deselected in 9.03s ======================== 2025-12-04T11:59:23.3735401Z Got exit code 1 2025-12-04T11:59:23.3735440Z Retrying single test... 
2025-12-04T11:59:23.3735649Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-db14df267c0d86ac.xml 2025-12-04T11:59:23.3735707Z ============================= test session starts ============================== 2025-12-04T11:59:23.3735841Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:59:23.3735881Z cachedir: .pytest_cache 2025-12-04T11:59:23.3736041Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:59:23.3736087Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:59:23.3736126Z configfile: pytest.ini 2025-12-04T11:59:23.3736289Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:59:23.3736366Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:59:23.3736637Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3736682Z Running 1 items in this shard 2025-12-04T11:59:23.3736684Z 2025-12-04T11:59:23.3737040Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda I1204 11:56:21.270000 208983 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 209052 2025-12-04T11:59:23.3737194Z I1204 11:56:21.271000 208983 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 209053 2025-12-04T11:59:23.3737349Z I1204 11:56:21.272000 208983 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 209054 2025-12-04T11:59:23.3737519Z I1204 11:56:21.272000 208983 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 209055 2025-12-04T11:59:23.3738020Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3738082Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3738569Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3738632Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3739117Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:59:23.3739175Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3739703Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3739761Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3739905Z [rank0]:E1204 11:56:28.309000 209052 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3740099Z [rank0]:E1204 11:56:28.309000 209052 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3740392Z [rank0]:E1204 11:56:28.309000 209052 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3740545Z [rank0]:E1204 11:56:28.309000 209052 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3740834Z [rank0]:E1204 11:56:28.309000 209052 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3740957Z [rank0]:E1204 11:56:28.309000 209052 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3741238Z [rank0]:E1204 11:56:28.309000 209052 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3741385Z [rank0]:E1204 11:56:28.309000 209052 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3741667Z [rank0]:E1204 11:56:28.309000 209052 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3741846Z [rank0]:E1204 11:56:28.309000 209052 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3742123Z [rank0]:E1204 11:56:28.309000 209052 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3742261Z [rank0]:E1204 11:56:28.309000 209052 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3742540Z [rank0]:E1204 11:56:28.309000 209052 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3742689Z [rank0]:E1204 11:56:28.309000 209052 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3743206Z [rank0]:E1204 11:56:28.309000 209052 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:59:23.3743324Z [rank0]:E1204 11:56:28.309000 209052 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3743521Z [rank0]:E1204 11:56:28.309000 209052 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3743928Z [rank0]:E1204 11:56:28.309000 209052 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3744043Z [rank0]:E1204 11:56:28.309000 209052 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3744253Z [rank0]:E1204 11:56:28.309000 209052 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3744445Z [rank0]:E1204 11:56:28.309000 209052 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:59:23.3744484Z dist init r=0, world=4 2025-12-04T11:59:23.3744623Z [rank2]:E1204 11:56:28.313000 209054 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3744783Z [rank2]:E1204 11:56:28.313000 209054 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3745069Z [rank2]:E1204 11:56:28.313000 209054 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3745223Z [rank2]:E1204 11:56:28.313000 209054 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3745508Z [rank2]:E1204 11:56:28.313000 209054 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3745630Z [rank2]:E1204 11:56:28.313000 209054 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3745907Z [rank2]:E1204 11:56:28.313000 209054 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3746079Z [rank2]:E1204 11:56:28.313000 209054 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3746359Z [rank2]:E1204 11:56:28.313000 209054 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3746506Z [rank2]:E1204 11:56:28.313000 209054 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3746783Z [rank2]:E1204 11:56:28.313000 209054 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3746918Z [rank2]:E1204 11:56:28.313000 209054 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3747198Z [rank2]:E1204 11:56:28.313000 209054 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3747345Z [rank2]:E1204 11:56:28.313000 209054 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3747860Z [rank2]:E1204 11:56:28.313000 209054 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:59:23.3747973Z [rank2]:E1204 11:56:28.313000 209054 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3748168Z [rank2]:E1204 11:56:28.313000 209054 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3748581Z [rank2]:E1204 11:56:28.313000 209054 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3748711Z [rank2]:E1204 11:56:28.313000 209054 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3748920Z [rank2]:E1204 11:56:28.313000 209054 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3749082Z [rank2]:E1204 11:56:28.313000 209054 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:59:23.3749120Z dist init r=2, world=4 2025-12-04T11:59:23.3749259Z [rank3]:E1204 11:56:28.321000 209055 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3749417Z [rank3]:E1204 11:56:28.321000 209055 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3749752Z [rank3]:E1204 11:56:28.321000 209055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3749904Z [rank3]:E1204 11:56:28.321000 209055 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3750188Z [rank3]:E1204 11:56:28.321000 209055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3750337Z [rank3]:E1204 11:56:28.321000 209055 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 
2025-12-04T11:59:23.3750615Z [rank3]:E1204 11:56:28.321000 209055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3750762Z [rank3]:E1204 11:56:28.321000 209055 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3751042Z [rank3]:E1204 11:56:28.321000 209055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3751187Z [rank3]:E1204 11:56:28.321000 209055 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3751462Z [rank3]:E1204 11:56:28.321000 209055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3751604Z [rank3]:E1204 11:56:28.321000 209055 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3751884Z [rank3]:E1204 11:56:28.321000 209055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3752032Z [rank3]:E1204 11:56:28.321000 209055 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3752543Z [rank3]:E1204 11:56:28.321000 209055 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 
2025-12-04T11:59:23.3752658Z [rank3]:E1204 11:56:28.321000 209055 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3752853Z [rank3]:E1204 11:56:28.321000 209055 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3753286Z [rank3]:E1204 11:56:28.321000 209055 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3753399Z [rank3]:E1204 11:56:28.321000 209055 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3753608Z [rank3]:E1204 11:56:28.321000 209055 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3753773Z [rank3]:E1204 11:56:28.321000 209055 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:59:23.3753811Z dist init r=3, world=4 2025-12-04T11:59:23.3753949Z [rank1]:E1204 11:56:28.325000 209053 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3754107Z [rank1]:E1204 11:56:28.325000 209053 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3754394Z [rank1]:E1204 11:56:28.325000 209053 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3754547Z [rank1]:E1204 11:56:28.325000 209053 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3754850Z [rank1]:E1204 11:56:28.325000 209053 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3754971Z [rank1]:E1204 11:56:28.325000 209053 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3755249Z [rank1]:E1204 11:56:28.325000 209053 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3755398Z [rank1]:E1204 11:56:28.325000 209053 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3755674Z [rank1]:E1204 11:56:28.325000 209053 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3755824Z [rank1]:E1204 11:56:28.325000 209053 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3756102Z [rank1]:E1204 11:56:28.325000 209053 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3756238Z [rank1]:E1204 11:56:28.325000 209053 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3756517Z [rank1]:E1204 11:56:28.325000 209053 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3756664Z [rank1]:E1204 11:56:28.325000 209053 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3757203Z [rank1]:E1204 11:56:28.325000 209053 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:59:23.3757317Z [rank1]:E1204 11:56:28.325000 209053 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3757513Z [rank1]:E1204 11:56:28.325000 209053 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3757921Z [rank1]:E1204 11:56:28.325000 209053 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3758034Z [rank1]:E1204 11:56:28.325000 209053 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3758245Z [rank1]:E1204 11:56:28.325000 209053 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3758408Z [rank1]:E1204 11:56:28.325000 209053 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:59:23.3758448Z dist init r=1, world=4 2025-12-04T11:59:23.3758785Z [rank0]:[W1204 11:56:28.470384681 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:59:23.3758845Z FAILED [8.8188s] [100%] 2025-12-04T11:59:23.3758847Z 2025-12-04T11:59:23.3758903Z =================================== FAILURES =================================== 2025-12-04T11:59:23.3759041Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda _ 2025-12-04T11:59:23.3759086Z Traceback (most recent call last): 2025-12-04T11:59:23.3759251Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:59:23.3759295Z self._join_processes(fn) 2025-12-04T11:59:23.3759471Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:59:23.3759524Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:59:23.3759744Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:59:23.3759790Z raise RuntimeError(error) 2025-12-04T11:59:23.3759869Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:59:23.3759912Z Traceback (most recent call last): 2025-12-04T11:59:23.3760072Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3760114Z getattr(self, test_name)() 2025-12-04T11:59:23.3760274Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3760308Z fn() 2025-12-04T11:59:23.3760459Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3760499Z method(*args, **kwargs) 2025-12-04T11:59:23.3760648Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3760689Z method(*args, **kwargs) 2025-12-04T11:59:23.3760839Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3760875Z with policy(): 2025-12-04T11:59:23.3761027Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3761068Z raise RuntimeError(msg) 2025-12-04T11:59:23.3761491Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
2025-12-04T11:59:23.3761493Z 2025-12-04T11:59:23.3761567Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3761845Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3761848Z 2025-12-04T11:59:23.3761934Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3761936Z 2025-12-04T11:59:23.3761938Z 2025-12-04T11:59:23.3762014Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:59:23.3762104Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:59:23.3762355Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-db14df267c0d86ac.xml - 2025-12-04T11:59:23.3762416Z =========================== short test summary info ============================ 2025-12-04T11:59:23.3762706Z FAILED [8.8188s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:59:23.3762779Z Traceback (most recent call last): 2025-12-04T11:59:23.3762946Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3762988Z getattr(self, test_name)() 2025-12-04T11:59:23.3763149Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3763183Z fn() 2025-12-04T11:59:23.3763335Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3763373Z method(*args, **kwargs) 2025-12-04T11:59:23.3763523Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3763564Z method(*args, **kwargs) 2025-12-04T11:59:23.3763716Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3763752Z with policy(): 2025-12-04T11:59:23.3763907Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3763948Z raise RuntimeError(msg) 2025-12-04T11:59:23.3764345Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:59:23.3764348Z 2025-12-04T11:59:23.3764420Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3764700Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3764704Z 2025-12-04T11:59:23.3764789Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3764852Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:59:23.3764913Z ======================= 1 failed, 7 deselected in 8.83s ======================== 2025-12-04T11:59:23.3764974Z Got exit code 1 2025-12-04T11:59:23.3765201Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3765331Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T11:59:23.3765538Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-ea93d0d366cc9967.xml 2025-12-04T11:59:23.3765598Z ============================= test session starts ============================== 2025-12-04T11:59:23.3765710Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:59:23.3765751Z cachedir: .pytest_cache 2025-12-04T11:59:23.3765909Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:59:23.3765956Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:59:23.3765995Z configfile: pytest.ini 2025-12-04T11:59:23.3766160Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:59:23.3766232Z collecting ... collected 8 items / 3 deselected / 5 selected 2025-12-04T11:59:23.3766283Z stepcurrent: skipping 3 already run items. 2025-12-04T11:59:23.3766325Z Running 5 items in this shard 2025-12-04T11:59:23.3766327Z 2025-12-04T11:59:23.3766703Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda I1204 11:56:32.688000 209385 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 209454 2025-12-04T11:59:23.3766858Z I1204 11:56:32.689000 209385 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 209455 2025-12-04T11:59:23.3767012Z I1204 11:56:32.689000 209385 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 209456 2025-12-04T11:59:23.3767164Z I1204 11:56:32.690000 209385 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 209457 2025-12-04T11:59:23.3767657Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3767722Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3768211Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:59:23.3768270Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3768758Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3768816Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3769324Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3769381Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3769522Z [rank0]:E1204 11:56:39.808000 209454 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3769728Z [rank0]:E1204 11:56:39.808000 209454 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3770023Z [rank0]:E1204 11:56:39.808000 209454 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3770178Z [rank0]:E1204 11:56:39.808000 209454 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3770469Z [rank0]:E1204 11:56:39.808000 209454 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3770593Z [rank0]:E1204 11:56:39.808000 209454 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3770872Z [rank0]:E1204 11:56:39.808000 209454 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3771045Z [rank0]:E1204 11:56:39.808000 209454 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3771326Z [rank0]:E1204 11:56:39.808000 209454 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3771473Z [rank0]:E1204 11:56:39.808000 209454 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3771749Z [rank0]:E1204 11:56:39.808000 209454 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3771884Z [rank0]:E1204 11:56:39.808000 209454 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3772164Z [rank0]:E1204 11:56:39.808000 209454 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3772312Z [rank0]:E1204 11:56:39.808000 209454 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3772826Z [rank0]:E1204 11:56:39.808000 209454 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:59:23.3772940Z [rank0]:E1204 11:56:39.808000 209454 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3773137Z [rank0]:E1204 11:56:39.808000 209454 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3773569Z [rank0]:E1204 11:56:39.808000 209454 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3773682Z [rank0]:E1204 11:56:39.808000 209454 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3773894Z [rank0]:E1204 11:56:39.808000 209454 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3774058Z [rank0]:E1204 11:56:39.808000 209454 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:59:23.3774098Z dist init r=0, world=4 2025-12-04T11:59:23.3774237Z [rank3]:E1204 11:56:39.810000 209457 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3774395Z [rank3]:E1204 11:56:39.810000 209457 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3774684Z [rank3]:E1204 11:56:39.810000 209457 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3774836Z [rank3]:E1204 11:56:39.810000 209457 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3775121Z [rank3]:E1204 11:56:39.810000 209457 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3775268Z [rank3]:E1204 11:56:39.810000 209457 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3775545Z [rank3]:E1204 11:56:39.810000 209457 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3775695Z [rank3]:E1204 11:56:39.810000 209457 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:59:23.3775973Z [rank3]:E1204 11:56:39.810000 209457 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3776119Z [rank3]:E1204 11:56:39.810000 209457 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3776398Z [rank3]:E1204 11:56:39.810000 209457 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3776535Z [rank3]:E1204 11:56:39.810000 209457 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3776815Z [rank3]:E1204 11:56:39.810000 209457 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3776963Z [rank3]:E1204 11:56:39.810000 209457 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3777482Z [rank3]:E1204 11:56:39.810000 209457 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 2025-12-04T11:59:23.3777597Z [rank3]:E1204 11:56:39.810000 209457 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3777812Z [rank3]:E1204 11:56:39.810000 209457 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3778218Z [rank3]:E1204 11:56:39.810000 209457 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3778332Z [rank3]:E1204 11:56:39.810000 209457 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3778541Z [rank3]:E1204 11:56:39.810000 209457 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3778706Z [rank3]:E1204 11:56:39.810000 209457 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:59:23.3778746Z dist init r=3, world=4 2025-12-04T11:59:23.3778883Z [rank2]:E1204 11:56:39.850000 209456 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3779042Z [rank2]:E1204 11:56:39.850000 209456 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3779329Z [rank2]:E1204 11:56:39.850000 209456 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3779500Z [rank2]:E1204 11:56:39.850000 209456 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3779814Z [rank2]:E1204 11:56:39.850000 209456 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3779938Z [rank2]:E1204 11:56:39.850000 209456 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3780214Z [rank2]:E1204 11:56:39.850000 209456 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3780361Z [rank2]:E1204 11:56:39.850000 209456 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3780641Z [rank2]:E1204 11:56:39.850000 209456 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3780786Z [rank2]:E1204 11:56:39.850000 209456 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3781064Z [rank2]:E1204 11:56:39.850000 209456 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3781200Z [rank2]:E1204 11:56:39.850000 209456 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3781480Z [rank2]:E1204 11:56:39.850000 209456 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3781632Z [rank2]:E1204 11:56:39.850000 209456 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3782175Z [rank2]:E1204 11:56:39.850000 209456 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 
2025-12-04T11:59:23.3782288Z [rank2]:E1204 11:56:39.850000 209456 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3782482Z [rank2]:E1204 11:56:39.850000 209456 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3782890Z [rank2]:E1204 11:56:39.850000 209456 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3783003Z [rank2]:E1204 11:56:39.850000 209456 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3783215Z [rank2]:E1204 11:56:39.850000 209456 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3783381Z [rank2]:E1204 11:56:39.850000 209456 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:59:23.3783418Z dist init r=2, world=4 2025-12-04T11:59:23.3783558Z [rank1]:E1204 11:56:39.863000 209455 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3783740Z [rank1]:E1204 11:56:39.863000 209455 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3784029Z [rank1]:E1204 11:56:39.863000 209455 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3784184Z [rank1]:E1204 11:56:39.863000 209455 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3784471Z [rank1]:E1204 11:56:39.863000 209455 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3784595Z [rank1]:E1204 11:56:39.863000 209455 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3784875Z [rank1]:E1204 11:56:39.863000 209455 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3785024Z [rank1]:E1204 11:56:39.863000 209455 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3785302Z [rank1]:E1204 11:56:39.863000 209455 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3785450Z [rank1]:E1204 11:56:39.863000 209455 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3785725Z [rank1]:E1204 11:56:39.863000 209455 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3785865Z [rank1]:E1204 11:56:39.863000 209455 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3786141Z [rank1]:E1204 11:56:39.863000 209455 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3786310Z [rank1]:E1204 11:56:39.863000 209455 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3786823Z [rank1]:E1204 11:56:39.863000 209455 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:59:23.3786938Z [rank1]:E1204 11:56:39.863000 209455 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3787137Z [rank1]:E1204 11:56:39.863000 209455 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3787543Z [rank1]:E1204 11:56:39.863000 209455 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3787657Z [rank1]:E1204 11:56:39.863000 209455 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3787867Z [rank1]:E1204 11:56:39.863000 209455 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3788053Z [rank1]:E1204 11:56:39.863000 209455 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:59:23.3788094Z dist init r=1, world=4 2025-12-04T11:59:23.3788433Z [rank0]:[W1204 11:56:39.974212773 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:59:23.3788475Z FAILED [8.9210s] [ 20%] 2025-12-04T11:59:23.3788476Z 2025-12-04T11:59:23.3788534Z =================================== FAILURES =================================== 2025-12-04T11:59:23.3788671Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda _ 2025-12-04T11:59:23.3788717Z Traceback (most recent call last): 2025-12-04T11:59:23.3788886Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:59:23.3788932Z self._join_processes(fn) 2025-12-04T11:59:23.3789110Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:59:23.3789163Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:59:23.3789346Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:59:23.3789390Z raise RuntimeError(error) 2025-12-04T11:59:23.3789473Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:59:23.3789517Z Traceback (most recent call last): 2025-12-04T11:59:23.3789712Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3789755Z getattr(self, test_name)() 2025-12-04T11:59:23.3789916Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3789952Z fn() 2025-12-04T11:59:23.3790106Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3790145Z method(*args, **kwargs) 2025-12-04T11:59:23.3790300Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3790368Z method(*args, **kwargs) 2025-12-04T11:59:23.3790521Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3790557Z with policy(): 2025-12-04T11:59:23.3790712Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3790755Z raise RuntimeError(msg) 2025-12-04T11:59:23.3791140Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
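[editor's note] The RuntimeError above comes from the test harness's CUDA memory-leak check, which records the caching-allocator and driver-level allocations before the test body and compares them afterwards; the reported numbers (512 -> 3072 bytes on device 0, roughly 2.46 GB -> 3.66 GB at the driver level) are that comparison. The sketch below only illustrates the idea with public torch.cuda APIs; it is not the CudaMemoryLeakCheck implementation in common_utils.py, and the function name and threshold logic are assumptions.

    import torch

    def check_for_leak(fn, device: int = 0) -> None:
        """Run fn() and flag GPU memory still held afterwards (illustrative only)."""
        torch.cuda.synchronize(device)
        alloc_before = torch.cuda.memory_allocated(device)   # caching-allocator bytes
        free_before, total = torch.cuda.mem_get_info(device)
        driver_before = total - free_before                   # driver-level allocation

        fn()

        torch.cuda.synchronize(device)
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        driver_after = total - free_after

        if alloc_after > alloc_before and driver_after > driver_before:
            raise RuntimeError(
                f"possible leak on device {device}: caching allocator "
                f"{alloc_before} -> {alloc_after}, driver {driver_before} -> {driver_after}"
            )
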
2025-12-04T11:59:23.3791144Z 2025-12-04T11:59:23.3791221Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3791507Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3791509Z 2025-12-04T11:59:23.3791598Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3791600Z 2025-12-04T11:59:23.3791602Z 2025-12-04T11:59:23.3791680Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:59:23.3791768Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:59:23.3792057Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-ea93d0d366cc9967.xml - 2025-12-04T11:59:23.3792119Z =========================== short test summary info ============================ 2025-12-04T11:59:23.3792417Z FAILED [8.9210s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:59:23.3792461Z Traceback (most recent call last): 2025-12-04T11:59:23.3792629Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3792672Z getattr(self, test_name)() 2025-12-04T11:59:23.3792835Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3792872Z fn() 2025-12-04T11:59:23.3793025Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3793065Z method(*args, **kwargs) 2025-12-04T11:59:23.3793220Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3793258Z method(*args, **kwargs) 2025-12-04T11:59:23.3793412Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3793450Z with policy(): 2025-12-04T11:59:23.3793605Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3793646Z raise RuntimeError(msg) 2025-12-04T11:59:23.3794039Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:59:23.3794043Z 2025-12-04T11:59:23.3794120Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3794425Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3794427Z 2025-12-04T11:59:23.3794516Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3794580Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:59:23.3794644Z ======================= 1 failed, 3 deselected in 8.93s ======================== 2025-12-04T11:59:23.3794680Z Got exit code 1 2025-12-04T11:59:23.3794725Z Retrying single test... 2025-12-04T11:59:23.3794934Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-4cc42fc7936fe35d.xml 2025-12-04T11:59:23.3794995Z ============================= test session starts ============================== 2025-12-04T11:59:23.3795109Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:59:23.3795151Z cachedir: .pytest_cache 2025-12-04T11:59:23.3795312Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:59:23.3795359Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:59:23.3795399Z configfile: pytest.ini 2025-12-04T11:59:23.3795569Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:59:23.3795642Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:59:23.3795935Z stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3795980Z Running 1 items in this shard 2025-12-04T11:59:23.3795982Z 2025-12-04T11:59:23.3796333Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda I1204 11:56:44.456000 209787 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 209856 2025-12-04T11:59:23.3796492Z I1204 11:56:44.456000 209787 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 209857 2025-12-04T11:59:23.3796643Z I1204 11:56:44.457000 209787 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 209858 2025-12-04T11:59:23.3796796Z I1204 11:56:44.457000 209787 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 209859 2025-12-04T11:59:23.3797295Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3797357Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3797850Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:59:23.3797911Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3798399Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3798476Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3798965Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3799025Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3799167Z [rank1]:E1204 11:56:51.563000 209857 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3799331Z [rank1]:E1204 11:56:51.563000 209857 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3799662Z [rank1]:E1204 11:56:51.563000 209857 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3799821Z [rank1]:E1204 11:56:51.563000 209857 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3800104Z [rank1]:E1204 11:56:51.563000 209857 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3800258Z [rank1]:E1204 11:56:51.563000 209857 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3800538Z [rank1]:E1204 11:56:51.563000 209857 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3800688Z [rank1]:E1204 11:56:51.563000 209857 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3800965Z [rank1]:E1204 11:56:51.563000 209857 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3801114Z [rank1]:E1204 11:56:51.563000 209857 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3801394Z [rank1]:E1204 11:56:51.563000 209857 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3801531Z [rank1]:E1204 11:56:51.563000 209857 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3801817Z [rank1]:E1204 11:56:51.563000 209857 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3801966Z [rank1]:E1204 11:56:51.563000 209857 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3802488Z [rank1]:E1204 11:56:51.563000 209857 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:59:23.3802604Z [rank1]:E1204 11:56:51.563000 209857 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3802825Z [rank1]:E1204 11:56:51.563000 209857 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3803235Z [rank1]:E1204 11:56:51.563000 209857 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3803351Z [rank1]:E1204 11:56:51.563000 209857 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3803564Z [rank1]:E1204 11:56:51.563000 209857 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3803730Z [rank1]:E1204 11:56:51.563000 209857 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:59:23.3803769Z dist init r=1, world=4 2025-12-04T11:59:23.3803908Z [rank2]:E1204 11:56:51.577000 209858 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3804065Z [rank2]:E1204 11:56:51.577000 209858 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3804353Z [rank2]:E1204 11:56:51.577000 209858 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3804526Z [rank2]:E1204 11:56:51.577000 209858 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3804812Z [rank2]:E1204 11:56:51.577000 209858 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3804937Z [rank2]:E1204 11:56:51.577000 209858 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3805216Z [rank2]:E1204 11:56:51.577000 209858 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3805364Z [rank2]:E1204 11:56:51.577000 209858 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:59:23.3805643Z [rank2]:E1204 11:56:51.577000 209858 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3805793Z [rank2]:E1204 11:56:51.577000 209858 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3806071Z [rank2]:E1204 11:56:51.577000 209858 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3806210Z [rank2]:E1204 11:56:51.577000 209858 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3806490Z [rank2]:E1204 11:56:51.577000 209858 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3806640Z [rank2]:E1204 11:56:51.577000 209858 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3807191Z [rank2]:E1204 11:56:51.577000 209858 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:59:23.3807304Z [rank2]:E1204 11:56:51.577000 209858 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3807501Z [rank2]:E1204 11:56:51.577000 209858 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3807905Z [rank2]:E1204 11:56:51.577000 209858 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3808021Z [rank2]:E1204 11:56:51.577000 209858 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3808235Z [rank2]:E1204 11:56:51.577000 209858 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3808399Z [rank2]:E1204 11:56:51.577000 209858 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:59:23.3808440Z dist init r=2, world=4 2025-12-04T11:59:23.3808577Z [rank3]:E1204 11:56:51.589000 209859 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3808756Z [rank3]:E1204 11:56:51.589000 209859 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3809044Z [rank3]:E1204 11:56:51.589000 209859 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3809199Z [rank3]:E1204 11:56:51.589000 209859 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3809483Z [rank3]:E1204 11:56:51.589000 209859 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3809643Z [rank3]:E1204 11:56:51.589000 209859 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3809920Z [rank3]:E1204 11:56:51.589000 209859 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3810068Z [rank3]:E1204 11:56:51.589000 209859 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3810349Z [rank3]:E1204 11:56:51.589000 209859 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3810495Z [rank3]:E1204 11:56:51.589000 209859 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3810770Z [rank3]:E1204 11:56:51.589000 209859 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3810908Z [rank3]:E1204 11:56:51.589000 209859 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3811187Z [rank3]:E1204 11:56:51.589000 209859 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3811362Z [rank3]:E1204 11:56:51.589000 209859 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3811878Z [rank3]:E1204 11:56:51.589000 209859 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 
2025-12-04T11:59:23.3811994Z [rank3]:E1204 11:56:51.589000 209859 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3812187Z [rank3]:E1204 11:56:51.589000 209859 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3812591Z [rank3]:E1204 11:56:51.589000 209859 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3812702Z [rank3]:E1204 11:56:51.589000 209859 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3812914Z [rank3]:E1204 11:56:51.589000 209859 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3813102Z [rank3]:E1204 11:56:51.589000 209859 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:59:23.3813140Z dist init r=3, world=4 2025-12-04T11:59:23.3813278Z [rank0]:E1204 11:56:51.623000 209856 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3813439Z [rank0]:E1204 11:56:51.623000 209856 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3813727Z [rank0]:E1204 11:56:51.623000 209856 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3813880Z [rank0]:E1204 11:56:51.623000 209856 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3814166Z [rank0]:E1204 11:56:51.623000 209856 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3814291Z [rank0]:E1204 11:56:51.623000 209856 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3814580Z [rank0]:E1204 11:56:51.623000 209856 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3814731Z [rank0]:E1204 11:56:51.623000 209856 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3815009Z [rank0]:E1204 11:56:51.623000 209856 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3815160Z [rank0]:E1204 11:56:51.623000 209856 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3815436Z [rank0]:E1204 11:56:51.623000 209856 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3815594Z [rank0]:E1204 11:56:51.623000 209856 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3815873Z [rank0]:E1204 11:56:51.623000 209856 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3816023Z [rank0]:E1204 11:56:51.623000 209856 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3816541Z [rank0]:E1204 11:56:51.623000 209856 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:59:23.3816656Z [rank0]:E1204 11:56:51.623000 209856 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3816855Z [rank0]:E1204 11:56:51.623000 209856 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3817258Z [rank0]:E1204 11:56:51.623000 209856 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3817393Z [rank0]:E1204 11:56:51.623000 209856 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3817608Z [rank0]:E1204 11:56:51.623000 209856 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3817773Z [rank0]:E1204 11:56:51.623000 209856 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:59:23.3817813Z dist init r=0, world=4 2025-12-04T11:59:23.3818154Z [rank0]:[W1204 11:56:51.878778688 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:59:23.3818196Z FAILED [9.0198s] [100%] 2025-12-04T11:59:23.3818198Z 2025-12-04T11:59:23.3818254Z =================================== FAILURES =================================== 2025-12-04T11:59:23.3818393Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda _ 2025-12-04T11:59:23.3818438Z Traceback (most recent call last): 2025-12-04T11:59:23.3818607Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:59:23.3818651Z self._join_processes(fn) 2025-12-04T11:59:23.3818829Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:59:23.3818882Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:59:23.3819063Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:59:23.3819106Z raise RuntimeError(error) 2025-12-04T11:59:23.3819186Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:59:23.3819231Z Traceback (most recent call last): 2025-12-04T11:59:23.3819394Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3819437Z getattr(self, test_name)() 2025-12-04T11:59:23.3819629Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3819663Z fn() 2025-12-04T11:59:23.3819840Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3819882Z method(*args, **kwargs) 2025-12-04T11:59:23.3820034Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3820076Z method(*args, **kwargs) 2025-12-04T11:59:23.3820227Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3820268Z with policy(): 2025-12-04T11:59:23.3820420Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3820463Z raise RuntimeError(msg) 2025-12-04T11:59:23.3820851Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 
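[editor's note] The ProcessGroupNCCL warning repeated above points out that destroy_process_group() was never called before the worker processes exited, which the linked shutdown documentation advises doing explicitly. Below is a minimal sketch of the init/teardown pairing for a per-rank worker; the function name and the use of MASTER_ADDR/MASTER_PORT from the launcher are assumptions, not taken from the test above.

    import torch
    import torch.distributed as dist

    def run_worker(rank: int, world_size: int) -> None:
        # Assumes MASTER_ADDR / MASTER_PORT are provided by the launcher (illustrative).
        torch.cuda.set_device(rank)
        dist.init_process_group("nccl", rank=rank, world_size=world_size)
        try:
            # ... collective work for this rank goes here ...
            dist.barrier()
        finally:
            # Explicit shutdown avoids the "destroy_process_group() was not called" warning.
            dist.destroy_process_group()
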
2025-12-04T11:59:23.3820853Z 2025-12-04T11:59:23.3820931Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3821213Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3821244Z 2025-12-04T11:59:23.3821333Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3821335Z 2025-12-04T11:59:23.3821337Z 2025-12-04T11:59:23.3821415Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:59:23.3821502Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:59:23.3821755Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-4cc42fc7936fe35d.xml - 2025-12-04T11:59:23.3821815Z =========================== short test summary info ============================ 2025-12-04T11:59:23.3822107Z FAILED [9.0198s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:59:23.3822154Z Traceback (most recent call last): 2025-12-04T11:59:23.3822319Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3822361Z getattr(self, test_name)() 2025-12-04T11:59:23.3822524Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3822557Z fn() 2025-12-04T11:59:23.3822712Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3822751Z method(*args, **kwargs) 2025-12-04T11:59:23.3822904Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3822943Z method(*args, **kwargs) 2025-12-04T11:59:23.3823097Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3823138Z with policy(): 2025-12-04T11:59:23.3823293Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3823335Z raise RuntimeError(msg) 2025-12-04T11:59:23.3823754Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:59:23.3823756Z 2025-12-04T11:59:23.3823832Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3824110Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3824112Z 2025-12-04T11:59:23.3824201Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3824263Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:59:23.3824324Z ======================= 1 failed, 7 deselected in 9.03s ======================== 2025-12-04T11:59:23.3824361Z Got exit code 1 2025-12-04T11:59:23.3824402Z Retrying single test... 2025-12-04T11:59:23.3824609Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-fa20f3f21d04c3a6.xml 2025-12-04T11:59:23.3824670Z ============================= test session starts ============================== 2025-12-04T11:59:23.3824784Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:59:23.3824823Z cachedir: .pytest_cache 2025-12-04T11:59:23.3824983Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:59:23.3825051Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:59:23.3825091Z configfile: pytest.ini 2025-12-04T11:59:23.3825255Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:59:23.3825328Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:59:23.3825597Z stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3825640Z Running 1 items in this shard 2025-12-04T11:59:23.3825642Z 2025-12-04T11:59:23.3825996Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda I1204 11:56:56.110000 210189 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 210258 2025-12-04T11:59:23.3826152Z I1204 11:56:56.110000 210189 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 210259 2025-12-04T11:59:23.3826304Z I1204 11:56:56.111000 210189 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 210260 2025-12-04T11:59:23.3826454Z I1204 11:56:56.111000 210189 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 210261 2025-12-04T11:59:23.3826951Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3827012Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3827502Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:59:23.3827563Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3828067Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3828126Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3828606Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3828663Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3828806Z [rank1]:E1204 11:57:03.214000 210259 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3828970Z [rank1]:E1204 11:57:03.214000 210259 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3829262Z [rank1]:E1204 11:57:03.214000 210259 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3829437Z [rank1]:E1204 11:57:03.214000 210259 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3829766Z [rank1]:E1204 11:57:03.214000 210259 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3829891Z [rank1]:E1204 11:57:03.214000 210259 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3830169Z [rank1]:E1204 11:57:03.214000 210259 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3830317Z [rank1]:E1204 11:57:03.214000 210259 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3830596Z [rank1]:E1204 11:57:03.214000 210259 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3830743Z [rank1]:E1204 11:57:03.214000 210259 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3831020Z [rank1]:E1204 11:57:03.214000 210259 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3831157Z [rank1]:E1204 11:57:03.214000 210259 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3831435Z [rank1]:E1204 11:57:03.214000 210259 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3831586Z [rank1]:E1204 11:57:03.214000 210259 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3832125Z [rank1]:E1204 11:57:03.214000 210259 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:59:23.3832243Z [rank1]:E1204 11:57:03.214000 210259 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3832438Z [rank1]:E1204 11:57:03.214000 210259 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3832843Z [rank1]:E1204 11:57:03.214000 210259 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3832961Z [rank1]:E1204 11:57:03.214000 210259 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3833173Z [rank1]:E1204 11:57:03.214000 210259 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3833338Z [rank1]:E1204 11:57:03.214000 210259 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:59:23.3833377Z dist init r=1, world=4 2025-12-04T11:59:23.3833516Z [rank2]:E1204 11:57:03.215000 210260 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3833700Z [rank2]:E1204 11:57:03.215000 210260 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3833989Z [rank2]:E1204 11:57:03.215000 210260 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3834144Z [rank2]:E1204 11:57:03.215000 210260 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3834431Z [rank2]:E1204 11:57:03.215000 210260 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3834554Z [rank2]:E1204 11:57:03.215000 210260 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3834832Z [rank2]:E1204 11:57:03.215000 210260 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3834985Z [rank2]:E1204 11:57:03.215000 210260 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:59:23.3835267Z [rank2]:E1204 11:57:03.215000 210260 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3835413Z [rank2]:E1204 11:57:03.215000 210260 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3835689Z [rank2]:E1204 11:57:03.215000 210260 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3835826Z [rank2]:E1204 11:57:03.215000 210260 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3836106Z [rank2]:E1204 11:57:03.215000 210260 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3836273Z [rank2]:E1204 11:57:03.215000 210260 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3836787Z [rank2]:E1204 11:57:03.215000 210260 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:59:23.3836901Z [rank2]:E1204 11:57:03.215000 210260 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3837096Z [rank2]:E1204 11:57:03.215000 210260 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3837505Z [rank2]:E1204 11:57:03.215000 210260 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3837618Z [rank2]:E1204 11:57:03.215000 210260 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3837829Z [rank2]:E1204 11:57:03.215000 210260 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3838019Z [rank2]:E1204 11:57:03.215000 210260 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:59:23.3838059Z dist init r=2, world=4 2025-12-04T11:59:23.3838195Z [rank0]:E1204 11:57:03.226000 210258 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3838355Z [rank0]:E1204 11:57:03.226000 210258 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3838643Z [rank0]:E1204 11:57:03.226000 210258 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3838798Z [rank0]:E1204 11:57:03.226000 210258 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3839084Z [rank0]:E1204 11:57:03.226000 210258 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3839208Z [rank0]:E1204 11:57:03.226000 210258 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3839491Z [rank0]:E1204 11:57:03.226000 210258 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3839676Z [rank0]:E1204 11:57:03.226000 210258 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3839955Z [rank0]:E1204 11:57:03.226000 210258 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3840105Z [rank0]:E1204 11:57:03.226000 210258 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3840388Z [rank0]:E1204 11:57:03.226000 210258 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3840550Z [rank0]:E1204 11:57:03.226000 210258 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3840831Z [rank0]:E1204 11:57:03.226000 210258 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3840982Z [rank0]:E1204 11:57:03.226000 210258 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3841494Z [rank0]:E1204 11:57:03.226000 210258 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
2025-12-04T11:59:23.3841611Z [rank0]:E1204 11:57:03.226000 210258 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3841807Z [rank0]:E1204 11:57:03.226000 210258 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3842213Z [rank0]:E1204 11:57:03.226000 210258 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3842353Z [rank0]:E1204 11:57:03.226000 210258 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3842564Z [rank0]:E1204 11:57:03.226000 210258 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3842732Z [rank0]:E1204 11:57:03.226000 210258 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:59:23.3842770Z dist init r=0, world=4 2025-12-04T11:59:23.3842908Z [rank3]:E1204 11:57:03.274000 210261 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3843068Z [rank3]:E1204 11:57:03.274000 210261 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3843356Z [rank3]:E1204 11:57:03.274000 210261 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3843510Z [rank3]:E1204 11:57:03.274000 210261 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3843799Z [rank3]:E1204 11:57:03.274000 210261 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3843925Z [rank3]:E1204 11:57:03.274000 210261 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3844204Z [rank3]:E1204 11:57:03.274000 210261 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3844354Z [rank3]:E1204 11:57:03.274000 210261 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3844631Z [rank3]:E1204 11:57:03.274000 210261 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3844780Z [rank3]:E1204 11:57:03.274000 210261 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3845077Z [rank3]:E1204 11:57:03.274000 210261 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3845216Z [rank3]:E1204 11:57:03.274000 210261 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3845501Z [rank3]:E1204 11:57:03.274000 210261 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3845651Z [rank3]:E1204 11:57:03.274000 210261 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3846172Z [rank3]:E1204 11:57:03.274000 210261 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 2025-12-04T11:59:23.3846283Z [rank3]:E1204 11:57:03.274000 210261 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3846480Z [rank3]:E1204 11:57:03.274000 210261 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3846901Z [rank3]:E1204 11:57:03.274000 210261 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3847016Z [rank3]:E1204 11:57:03.274000 210261 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3847229Z [rank3]:E1204 11:57:03.274000 210261 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3847394Z [rank3]:E1204 11:57:03.274000 210261 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:59:23.3847434Z dist init r=3, world=4 2025-12-04T11:59:23.3847775Z [rank0]:[W1204 11:57:03.394562699 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:59:23.3847817Z FAILED [9.0182s] [100%] 2025-12-04T11:59:23.3847819Z 2025-12-04T11:59:23.3847876Z =================================== FAILURES =================================== 2025-12-04T11:59:23.3848015Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda _ 2025-12-04T11:59:23.3848060Z Traceback (most recent call last): 2025-12-04T11:59:23.3848225Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:59:23.3848268Z self._join_processes(fn) 2025-12-04T11:59:23.3848442Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:59:23.3848496Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:59:23.3848676Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:59:23.3848722Z raise RuntimeError(error) 2025-12-04T11:59:23.3848802Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:59:23.3848849Z Traceback (most recent call last): 2025-12-04T11:59:23.3849029Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3849071Z getattr(self, test_name)() 2025-12-04T11:59:23.3849231Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3849266Z fn() 2025-12-04T11:59:23.3849418Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3849462Z method(*args, **kwargs) 2025-12-04T11:59:23.3849652Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3849694Z method(*args, **kwargs) 2025-12-04T11:59:23.3849847Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3849885Z with policy(): 2025-12-04T11:59:23.3850041Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3850085Z raise RuntimeError(msg) 2025-12-04T11:59:23.3850470Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 
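The RuntimeError above reports two pairs of numbers: caching-allocator bytes before and after the test, and driver-allocated bytes before and after. The sketch below is only an illustration of how those two counters can be observed with public APIs around a suspect operation; it is not the internal leak checker used by the test harness, and `suspect_op` is a placeholder for the code under test.

```python
# Illustrative only: observe the allocator and driver numbers quoted in the
# RuntimeError above. Not the harness's internal CudaMemoryLeakCheck.
import torch

def measure(device: int):
    torch.cuda.synchronize(device)
    alloc = torch.cuda.memory_allocated(device)    # caching-allocator bytes
    free, total = torch.cuda.mem_get_info(device)  # driver-level view of the device
    return alloc, total - free                     # (allocator bytes, driver-allocated bytes)

def check_for_leak(suspect_op, device: int = 0) -> None:
    before_alloc, before_driver = measure(device)
    suspect_op()
    torch.cuda.empty_cache()                       # release cached blocks before re-measuring
    after_alloc, after_driver = measure(device)
    if after_alloc > before_alloc:
        print(f"allocator grew: {before_alloc} -> {after_alloc} bytes")
    if after_driver > before_driver:
        print(f"driver allocation grew: {before_driver} -> {after_driver} bytes")
```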
2025-12-04T11:59:23.3850500Z 2025-12-04T11:59:23.3850580Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3850860Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3850862Z 2025-12-04T11:59:23.3850948Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3850952Z 2025-12-04T11:59:23.3850954Z 2025-12-04T11:59:23.3851031Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:59:23.3851118Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:59:23.3851371Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-fa20f3f21d04c3a6.xml - 2025-12-04T11:59:23.3851434Z =========================== short test summary info ============================ 2025-12-04T11:59:23.3851730Z FAILED [9.0182s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:59:23.3851774Z Traceback (most recent call last): 2025-12-04T11:59:23.3851942Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3851984Z getattr(self, test_name)() 2025-12-04T11:59:23.3852147Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3852184Z fn() 2025-12-04T11:59:23.3852337Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3852377Z method(*args, **kwargs) 2025-12-04T11:59:23.3852530Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3852572Z method(*args, **kwargs) 2025-12-04T11:59:23.3852723Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3852761Z with policy(): 2025-12-04T11:59:23.3852940Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3852983Z raise RuntimeError(msg) 2025-12-04T11:59:23.3853368Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:59:23.3853370Z 2025-12-04T11:59:23.3853449Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3853729Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3853731Z 2025-12-04T11:59:23.3853821Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3853887Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:59:23.3853949Z ======================= 1 failed, 7 deselected in 9.03s ======================== 2025-12-04T11:59:23.3853989Z Got exit code 1 2025-12-04T11:59:23.3854215Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3854346Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T11:59:23.3854580Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-b3741d276df85ab4.xml 2025-12-04T11:59:23.3854640Z ============================= test session starts ============================== 2025-12-04T11:59:23.3854751Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:59:23.3854793Z cachedir: .pytest_cache 2025-12-04T11:59:23.3854951Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:59:23.3855000Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:59:23.3855039Z configfile: pytest.ini 2025-12-04T11:59:23.3855206Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:59:23.3855276Z collecting ... collected 8 items / 4 deselected / 4 selected 2025-12-04T11:59:23.3855332Z stepcurrent: skipping 4 already run items. 2025-12-04T11:59:23.3855373Z Running 4 items in this shard 2025-12-04T11:59:23.3855375Z 2025-12-04T11:59:23.3855730Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda I1204 11:57:07.665000 210591 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 210660 2025-12-04T11:59:23.3855887Z I1204 11:57:07.666000 210591 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 210661 2025-12-04T11:59:23.3856039Z I1204 11:57:07.666000 210591 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 210662 2025-12-04T11:59:23.3856190Z I1204 11:57:07.667000 210591 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 210663 2025-12-04T11:59:23.3856689Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3856755Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3857267Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:59:23.3857330Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3857819Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3857878Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3858366Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3858422Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3858566Z [rank3]:E1204 11:57:14.805000 210663 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3858750Z [rank3]:E1204 11:57:14.805000 210663 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3859040Z [rank3]:E1204 11:57:14.805000 210663 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3859198Z [rank3]:E1204 11:57:14.805000 210663 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3859482Z [rank3]:E1204 11:57:14.805000 210663 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3859652Z [rank3]:E1204 11:57:14.805000 210663 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3859932Z [rank3]:E1204 11:57:14.805000 210663 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3860082Z [rank3]:E1204 11:57:14.805000 210663 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3860360Z [rank3]:E1204 11:57:14.805000 210663 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3860510Z [rank3]:E1204 11:57:14.805000 210663 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3860787Z [rank3]:E1204 11:57:14.805000 210663 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3860927Z [rank3]:E1204 11:57:14.805000 210663 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3861207Z [rank3]:E1204 11:57:14.805000 210663 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3861382Z [rank3]:E1204 11:57:14.805000 210663 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3861901Z [rank3]:E1204 11:57:14.805000 210663 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 2025-12-04T11:59:23.3862017Z [rank3]:E1204 11:57:14.805000 210663 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3862216Z [rank3]:E1204 11:57:14.805000 210663 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3862625Z [rank3]:E1204 11:57:14.805000 210663 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3862737Z [rank3]:E1204 11:57:14.805000 210663 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3862949Z [rank3]:E1204 11:57:14.805000 210663 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3863136Z [rank3]:E1204 11:57:14.805000 210663 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:59:23.3863176Z dist init r=3, world=4 2025-12-04T11:59:23.3863311Z [rank2]:E1204 11:57:14.813000 210662 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3863472Z [rank2]:E1204 11:57:14.813000 210662 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3863761Z [rank2]:E1204 11:57:14.813000 210662 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3863917Z [rank2]:E1204 11:57:14.813000 210662 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3864205Z [rank2]:E1204 11:57:14.813000 210662 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3864329Z [rank2]:E1204 11:57:14.813000 210662 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3864610Z [rank2]:E1204 11:57:14.813000 210662 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3864757Z [rank2]:E1204 11:57:14.813000 210662 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:59:23.3865035Z [rank2]:E1204 11:57:14.813000 210662 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3865182Z [rank2]:E1204 11:57:14.813000 210662 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3865458Z [rank2]:E1204 11:57:14.813000 210662 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3865616Z [rank2]:E1204 11:57:14.813000 210662 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3865897Z [rank2]:E1204 11:57:14.813000 210662 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3866045Z [rank2]:E1204 11:57:14.813000 210662 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3866558Z [rank2]:E1204 11:57:14.813000 210662 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:59:23.3866675Z [rank2]:E1204 11:57:14.813000 210662 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3866870Z [rank2]:E1204 11:57:14.813000 210662 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3867272Z [rank2]:E1204 11:57:14.813000 210662 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3867406Z [rank2]:E1204 11:57:14.813000 210662 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3867614Z [rank2]:E1204 11:57:14.813000 210662 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3867778Z [rank2]:E1204 11:57:14.813000 210662 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:59:23.3867817Z dist init r=2, world=4 2025-12-04T11:59:23.3867955Z [rank1]:E1204 11:57:14.822000 210661 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3868116Z [rank1]:E1204 11:57:14.822000 210661 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3868408Z [rank1]:E1204 11:57:14.822000 210661 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3868563Z [rank1]:E1204 11:57:14.822000 210661 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3868847Z [rank1]:E1204 11:57:14.822000 210661 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3868973Z [rank1]:E1204 11:57:14.822000 210661 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3869248Z [rank1]:E1204 11:57:14.822000 210661 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3869397Z [rank1]:E1204 11:57:14.822000 210661 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3869705Z [rank1]:E1204 11:57:14.822000 210661 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3869891Z [rank1]:E1204 11:57:14.822000 210661 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3870167Z [rank1]:E1204 11:57:14.822000 210661 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3870300Z [rank1]:E1204 11:57:14.822000 210661 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3870584Z [rank1]:E1204 11:57:14.822000 210661 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3870732Z [rank1]:E1204 11:57:14.822000 210661 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3871248Z [rank1]:E1204 11:57:14.822000 210661 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 
2025-12-04T11:59:23.3871362Z [rank1]:E1204 11:57:14.822000 210661 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3871581Z [rank1]:E1204 11:57:14.822000 210661 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3871983Z [rank1]:E1204 11:57:14.822000 210661 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3872097Z [rank1]:E1204 11:57:14.822000 210661 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3872309Z [rank1]:E1204 11:57:14.822000 210661 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3872472Z [rank1]:E1204 11:57:14.822000 210661 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:59:23.3872514Z dist init r=1, world=4 2025-12-04T11:59:23.3872650Z [rank0]:E1204 11:57:14.871000 210660 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3872810Z [rank0]:E1204 11:57:14.871000 210660 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3873098Z [rank0]:E1204 11:57:14.871000 210660 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3873252Z [rank0]:E1204 11:57:14.871000 210660 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3873538Z [rank0]:E1204 11:57:14.871000 210660 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3873662Z [rank0]:E1204 11:57:14.871000 210660 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3873940Z [rank0]:E1204 11:57:14.871000 210660 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3874108Z [rank0]:E1204 11:57:14.871000 210660 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3874386Z [rank0]:E1204 11:57:14.871000 210660 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3874530Z [rank0]:E1204 11:57:14.871000 210660 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3874806Z [rank0]:E1204 11:57:14.871000 210660 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3874943Z [rank0]:E1204 11:57:14.871000 210660 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3875222Z [rank0]:E1204 11:57:14.871000 210660 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3875371Z [rank0]:E1204 11:57:14.871000 210660 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3875881Z [rank0]:E1204 11:57:14.871000 210660 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:59:23.3876018Z [rank0]:E1204 11:57:14.871000 210660 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3876215Z [rank0]:E1204 11:57:14.871000 210660 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3876622Z [rank0]:E1204 11:57:14.871000 210660 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3876736Z [rank0]:E1204 11:57:14.871000 210660 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3876947Z [rank0]:E1204 11:57:14.871000 210660 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3877112Z [rank0]:E1204 11:57:14.871000 210660 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:59:23.3877150Z dist init r=0, world=4 2025-12-04T11:59:23.3877492Z [rank0]:[W1204 11:57:15.230124198 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
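The ProcessGroupNCCL warning above notes that `destroy_process_group()` was never called before exit. A minimal teardown sketch, under the assumption that the distributed work is wrapped in a single callable, is to put the call in a `finally` block so it runs even when a rank raises:

```python
# Teardown sketch for the ProcessGroupNCCL warning above; setup details are
# placeholders, the point is the try/finally around the distributed work.
import torch.distributed as dist

def run_distributed(work) -> None:
    dist.init_process_group(backend="nccl")
    try:
        work()                        # collectives, FSDP forward/backward, etc.
    finally:
        dist.destroy_process_group()  # avoids the resource-leak warning at exit
```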
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:59:23.3877532Z FAILED [8.9190s] [ 25%] 2025-12-04T11:59:23.3877534Z 2025-12-04T11:59:23.3877592Z =================================== FAILURES =================================== 2025-12-04T11:59:23.3877726Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda _ 2025-12-04T11:59:23.3877774Z Traceback (most recent call last): 2025-12-04T11:59:23.3877937Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:59:23.3877979Z self._join_processes(fn) 2025-12-04T11:59:23.3878155Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:59:23.3878227Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:59:23.3878407Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:59:23.3878451Z raise RuntimeError(error) 2025-12-04T11:59:23.3878530Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T11:59:23.3878574Z Traceback (most recent call last): 2025-12-04T11:59:23.3878735Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3878779Z getattr(self, test_name)() 2025-12-04T11:59:23.3878938Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3878972Z fn() 2025-12-04T11:59:23.3879124Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3879166Z method(*args, **kwargs) 2025-12-04T11:59:23.3879321Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3879361Z method(*args, **kwargs) 2025-12-04T11:59:23.3879512Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3879548Z with policy(): 2025-12-04T11:59:23.3879739Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3879811Z raise RuntimeError(msg) 2025-12-04T11:59:23.3880201Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 
2025-12-04T11:59:23.3880205Z 2025-12-04T11:59:23.3880283Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3880561Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3880563Z 2025-12-04T11:59:23.3880652Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3880655Z 2025-12-04T11:59:23.3880657Z 2025-12-04T11:59:23.3880732Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:59:23.3880823Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:59:23.3881074Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-b3741d276df85ab4.xml - 2025-12-04T11:59:23.3881138Z =========================== short test summary info ============================ 2025-12-04T11:59:23.3881430Z FAILED [8.9190s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T11:59:23.3881474Z Traceback (most recent call last): 2025-12-04T11:59:23.3881639Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3881681Z getattr(self, test_name)() 2025-12-04T11:59:23.3881846Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3881879Z fn() 2025-12-04T11:59:23.3882036Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3882075Z method(*args, **kwargs) 2025-12-04T11:59:23.3882252Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3882292Z method(*args, **kwargs) 2025-12-04T11:59:23.3882444Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3882480Z with policy(): 2025-12-04T11:59:23.3882636Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3882678Z raise RuntimeError(msg) 2025-12-04T11:59:23.3883066Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 2025-12-04T11:59:23.3883068Z 2025-12-04T11:59:23.3883142Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3883424Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3883426Z 2025-12-04T11:59:23.3883513Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3883577Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:59:23.3883660Z ======================= 1 failed, 4 deselected in 8.93s ======================== 2025-12-04T11:59:23.3883697Z Got exit code 1 2025-12-04T11:59:23.3883737Z Retrying single test... 2025-12-04T11:59:23.3883940Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-3967fa06d3850202.xml 2025-12-04T11:59:23.3884000Z ============================= test session starts ============================== 2025-12-04T11:59:23.3884112Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:59:23.3884152Z cachedir: .pytest_cache 2025-12-04T11:59:23.3884315Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:59:23.3884362Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:59:23.3884400Z configfile: pytest.ini 2025-12-04T11:59:23.3884566Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:59:23.3884636Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:59:23.3884907Z stepcurrent: skipping 4 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3884949Z Running 1 items in this shard 2025-12-04T11:59:23.3884953Z 2025-12-04T11:59:23.3885305Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda I1204 11:57:19.287000 210993 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 211062 2025-12-04T11:59:23.3885460Z I1204 11:57:19.288000 210993 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 211063 2025-12-04T11:59:23.3885613Z I1204 11:57:19.288000 210993 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 211064 2025-12-04T11:59:23.3885764Z I1204 11:57:19.289000 210993 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 211065 2025-12-04T11:59:23.3886290Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3886352Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3886840Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:59:23.3886905Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3887394Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3887451Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3887938Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3888016Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3888160Z [rank0]:E1204 11:57:26.313000 211062 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3888325Z [rank0]:E1204 11:57:26.313000 211062 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3888617Z [rank0]:E1204 11:57:26.313000 211062 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3888776Z [rank0]:E1204 11:57:26.313000 211062 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3889066Z [rank0]:E1204 11:57:26.313000 211062 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3889189Z [rank0]:E1204 11:57:26.313000 211062 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3889469Z [rank0]:E1204 11:57:26.313000 211062 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3889654Z [rank0]:E1204 11:57:26.313000 211062 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3889931Z [rank0]:E1204 11:57:26.313000 211062 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3890081Z [rank0]:E1204 11:57:26.313000 211062 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3890360Z [rank0]:E1204 11:57:26.313000 211062 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3890521Z [rank0]:E1204 11:57:26.313000 211062 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3890801Z [rank0]:E1204 11:57:26.313000 211062 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3890948Z [rank0]:E1204 11:57:26.313000 211062 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3891470Z [rank0]:E1204 11:57:26.313000 211062 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:59:23.3891587Z [rank0]:E1204 11:57:26.313000 211062 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3891783Z [rank0]:E1204 11:57:26.313000 211062 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3892189Z [rank0]:E1204 11:57:26.313000 211062 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3892326Z [rank0]:E1204 11:57:26.313000 211062 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3892537Z [rank0]:E1204 11:57:26.313000 211062 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3892702Z [rank0]:E1204 11:57:26.313000 211062 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:59:23.3892742Z dist init r=0, world=4 2025-12-04T11:59:23.3892879Z [rank1]:E1204 11:57:26.314000 211063 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3893039Z [rank1]:E1204 11:57:26.314000 211063 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3893330Z [rank1]:E1204 11:57:26.314000 211063 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3893485Z [rank1]:E1204 11:57:26.314000 211063 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3893774Z [rank1]:E1204 11:57:26.314000 211063 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3893896Z [rank1]:E1204 11:57:26.314000 211063 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3894174Z [rank1]:E1204 11:57:26.314000 211063 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3894322Z [rank1]:E1204 11:57:26.314000 211063 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:59:23.3894600Z [rank1]:E1204 11:57:26.314000 211063 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3894767Z [rank1]:E1204 11:57:26.314000 211063 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3895043Z [rank1]:E1204 11:57:26.314000 211063 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3895179Z [rank1]:E1204 11:57:26.314000 211063 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3895459Z [rank1]:E1204 11:57:26.314000 211063 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3895607Z [rank1]:E1204 11:57:26.314000 211063 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3896120Z [rank1]:E1204 11:57:26.314000 211063 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:59:23.3896235Z [rank1]:E1204 11:57:26.314000 211063 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3896431Z [rank1]:E1204 11:57:26.314000 211063 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3896853Z [rank1]:E1204 11:57:26.314000 211063 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3896970Z [rank1]:E1204 11:57:26.314000 211063 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3897179Z [rank1]:E1204 11:57:26.314000 211063 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3897342Z [rank1]:E1204 11:57:26.314000 211063 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:59:23.3897383Z dist init r=1, world=4 2025-12-04T11:59:23.3897521Z [rank3]:E1204 11:57:26.315000 211065 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3897679Z [rank3]:E1204 11:57:26.315000 211065 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3897969Z [rank3]:E1204 11:57:26.315000 211065 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3898123Z [rank3]:E1204 11:57:26.315000 211065 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3898408Z [rank3]:E1204 11:57:26.315000 211065 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3898534Z [rank3]:E1204 11:57:26.315000 211065 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3898810Z [rank3]:E1204 11:57:26.315000 211065 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3898977Z [rank3]:E1204 11:57:26.315000 211065 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3911459Z [rank3]:E1204 11:57:26.315000 211065 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3911623Z [rank3]:E1204 11:57:26.315000 211065 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3911909Z [rank3]:E1204 11:57:26.315000 211065 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3912052Z [rank3]:E1204 11:57:26.315000 211065 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3912347Z [rank3]:E1204 11:57:26.315000 211065 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3912499Z [rank3]:E1204 11:57:26.315000 211065 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3913027Z [rank3]:E1204 11:57:26.315000 211065 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 
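Note: the UserWarning emitted from torch/distributed/fsdp/_init_utils.py above recommends either calling torch.cuda.set_device() before FSDP initialization or passing an explicit device index as the device_id argument. The following is only a minimal sketch of that fix, assuming the default process group is already initialized (the nn.Linear module is a placeholder, not the model under test):

    import torch
    import torch.distributed as dist
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    # Assumes the default process group was already initialized,
    # e.g. by a torchrun launch plus dist.init_process_group(backend="nccl").
    rank = dist.get_rank()

    # Bind this process to one GPU before constructing FSDP, as the warning advises.
    torch.cuda.set_device(rank)

    model = nn.Linear(8, 8)  # placeholder module, not the model under test

    # Passing an explicit device index, instead of the bare "cuda" device that
    # triggered the warning, removes the ambiguity FSDP is complaining about.
    sharded = FSDP(model, device_id=torch.device("cuda", rank))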
2025-12-04T11:59:23.3913199Z [rank3]:E1204 11:57:26.315000 211065 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3913400Z [rank3]:E1204 11:57:26.315000 211065 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3913812Z [rank3]:E1204 11:57:26.315000 211065 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3913928Z [rank3]:E1204 11:57:26.315000 211065 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3914147Z [rank3]:E1204 11:57:26.315000 211065 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3914318Z [rank3]:E1204 11:57:26.315000 211065 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:59:23.3914358Z dist init r=3, world=4 2025-12-04T11:59:23.3914503Z [rank2]:E1204 11:57:26.364000 211064 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3914666Z [rank2]:E1204 11:57:26.364000 211064 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3914960Z [rank2]:E1204 11:57:26.364000 211064 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3915118Z [rank2]:E1204 11:57:26.364000 211064 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3915410Z [rank2]:E1204 11:57:26.364000 211064 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3915564Z [rank2]:E1204 11:57:26.364000 211064 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3915845Z [rank2]:E1204 11:57:26.364000 211064 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3915996Z [rank2]:E1204 11:57:26.364000 211064 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3916276Z [rank2]:E1204 11:57:26.364000 211064 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3916429Z [rank2]:E1204 11:57:26.364000 211064 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3916709Z [rank2]:E1204 11:57:26.364000 211064 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3916844Z [rank2]:E1204 11:57:26.364000 211064 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3917128Z [rank2]:E1204 11:57:26.364000 211064 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3917303Z [rank2]:E1204 11:57:26.364000 211064 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3917827Z [rank2]:E1204 11:57:26.364000 211064 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:59:23.3917942Z [rank2]:E1204 11:57:26.364000 211064 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3918139Z [rank2]:E1204 11:57:26.364000 211064 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3918551Z [rank2]:E1204 11:57:26.364000 211064 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3918666Z [rank2]:E1204 11:57:26.364000 211064 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3918886Z [rank2]:E1204 11:57:26.364000 211064 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3919051Z [rank2]:E1204 11:57:26.364000 211064 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:59:23.3919092Z dist init r=2, world=4 2025-12-04T11:59:23.3919436Z [rank0]:[W1204 11:57:26.481518972 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:59:23.3919483Z FAILED [8.8178s] [100%] 2025-12-04T11:59:23.3919486Z 2025-12-04T11:59:23.3919546Z =================================== FAILURES =================================== 2025-12-04T11:59:23.3919724Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda _ 2025-12-04T11:59:23.3919772Z Traceback (most recent call last): 2025-12-04T11:59:23.3919971Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:59:23.3920017Z self._join_processes(fn) 2025-12-04T11:59:23.3920194Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:59:23.3920250Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:59:23.3920430Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:59:23.3920477Z raise RuntimeError(error) 2025-12-04T11:59:23.3920559Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:59:23.3920606Z Traceback (most recent call last): 2025-12-04T11:59:23.3920769Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3920812Z getattr(self, test_name)() 2025-12-04T11:59:23.3920975Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3921012Z fn() 2025-12-04T11:59:23.3921168Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3921211Z method(*args, **kwargs) 2025-12-04T11:59:23.3921364Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3921436Z method(*args, **kwargs) 2025-12-04T11:59:23.3921590Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3921628Z with policy(): 2025-12-04T11:59:23.3921784Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3921827Z raise RuntimeError(msg) 2025-12-04T11:59:23.3922226Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
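Note: the RuntimeError above is raised by PyTorch's CUDA memory-leak check, which compares caching-allocator and driver-level allocation numbers taken before and after the test body. The snippet below only illustrates that before/after comparison with public torch.cuda APIs; it is not the actual check implemented in torch.testing._internal:

    import torch

    def device_memory(device: int):
        # Bytes currently held by the caching allocator on this device.
        allocator_bytes = torch.cuda.memory_allocated(device)
        # Driver-level view of the same device: total minus free memory.
        free_bytes, total_bytes = torch.cuda.mem_get_info(device)
        return allocator_bytes, total_bytes - free_bytes

    before = device_memory(0)
    # ... run the suspect test body here ...
    torch.cuda.synchronize(0)
    after = device_memory(0)

    if after[0] > before[0]:
        raise RuntimeError(
            f"caching allocator grew from {before[0]} to {after[0]} bytes on device 0"
        )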
2025-12-04T11:59:23.3922229Z 2025-12-04T11:59:23.3922308Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3922599Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3922602Z 2025-12-04T11:59:23.3922692Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3922694Z 2025-12-04T11:59:23.3922696Z 2025-12-04T11:59:23.3922777Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:59:23.3922870Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:59:23.3923131Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-3967fa06d3850202.xml - 2025-12-04T11:59:23.3923193Z =========================== short test summary info ============================ 2025-12-04T11:59:23.3923494Z FAILED [8.8178s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:59:23.3923542Z Traceback (most recent call last): 2025-12-04T11:59:23.3923712Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3923758Z getattr(self, test_name)() 2025-12-04T11:59:23.3923942Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3923980Z fn() 2025-12-04T11:59:23.3924133Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3924175Z method(*args, **kwargs) 2025-12-04T11:59:23.3924327Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3924370Z method(*args, **kwargs) 2025-12-04T11:59:23.3924521Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3924560Z with policy(): 2025-12-04T11:59:23.3924718Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3924760Z raise RuntimeError(msg) 2025-12-04T11:59:23.3925153Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:59:23.3925155Z 2025-12-04T11:59:23.3925232Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3925518Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3925545Z 2025-12-04T11:59:23.3925634Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3925701Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:59:23.3925764Z ======================= 1 failed, 7 deselected in 8.83s ======================== 2025-12-04T11:59:23.3925805Z Got exit code 1 2025-12-04T11:59:23.3925844Z Retrying single test... 2025-12-04T11:59:23.3926056Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-d90c84a1d19c859a.xml 2025-12-04T11:59:23.3926114Z ============================= test session starts ============================== 2025-12-04T11:59:23.3926232Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:59:23.3926274Z cachedir: .pytest_cache 2025-12-04T11:59:23.3926439Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:59:23.3926487Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:59:23.3926529Z configfile: pytest.ini 2025-12-04T11:59:23.3926696Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:59:23.3926773Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:59:23.3927046Z stepcurrent: skipping 4 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3927089Z Running 1 items in this shard 2025-12-04T11:59:23.3927091Z 2025-12-04T11:59:23.3927443Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda I1204 11:57:30.695000 211395 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 211464 2025-12-04T11:59:23.3927602Z I1204 11:57:30.696000 211395 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 211465 2025-12-04T11:59:23.3927757Z I1204 11:57:30.696000 211395 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 211466 2025-12-04T11:59:23.3927931Z I1204 11:57:30.697000 211395 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 211467 2025-12-04T11:59:23.3928441Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3928507Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3929005Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:59:23.3929065Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3929565Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3929698Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3930195Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3930254Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3930399Z [rank3]:E1204 11:57:37.875000 211467 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3930566Z [rank3]:E1204 11:57:37.875000 211467 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3930861Z [rank3]:E1204 11:57:37.875000 211467 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3931020Z [rank3]:E1204 11:57:37.875000 211467 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3931312Z [rank3]:E1204 11:57:37.875000 211467 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3931439Z [rank3]:E1204 11:57:37.875000 211467 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3931722Z [rank3]:E1204 11:57:37.875000 211467 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3931874Z [rank3]:E1204 11:57:37.875000 211467 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3932155Z [rank3]:E1204 11:57:37.875000 211467 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3932328Z [rank3]:E1204 11:57:37.875000 211467 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3932610Z [rank3]:E1204 11:57:37.875000 211467 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3932750Z [rank3]:E1204 11:57:37.875000 211467 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3933030Z [rank3]:E1204 11:57:37.875000 211467 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3933180Z [rank3]:E1204 11:57:37.875000 211467 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3933702Z [rank3]:E1204 11:57:37.875000 211467 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2243952640 and is now 3454009344. 2025-12-04T11:59:23.3933818Z [rank3]:E1204 11:57:37.875000 211467 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3934015Z [rank3]:E1204 11:57:37.875000 211467 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3934454Z [rank3]:E1204 11:57:37.875000 211467 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3934571Z [rank3]:E1204 11:57:37.875000 211467 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3934786Z [rank3]:E1204 11:57:37.875000 211467 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3934954Z [rank3]:E1204 11:57:37.875000 211467 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:59:23.3934993Z dist init r=3, world=4 2025-12-04T11:59:23.3935133Z [rank1]:E1204 11:57:37.882000 211465 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3935294Z [rank1]:E1204 11:57:37.882000 211465 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3935587Z [rank1]:E1204 11:57:37.882000 211465 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3935746Z [rank1]:E1204 11:57:37.882000 211465 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3936032Z [rank1]:E1204 11:57:37.882000 211465 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3936159Z [rank1]:E1204 11:57:37.882000 211465 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3936439Z [rank1]:E1204 11:57:37.882000 211465 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3936606Z [rank1]:E1204 11:57:37.882000 211465 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:59:23.3936887Z [rank1]:E1204 11:57:37.882000 211465 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3937035Z [rank1]:E1204 11:57:37.882000 211465 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3937314Z [rank1]:E1204 11:57:37.882000 211465 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3937453Z [rank1]:E1204 11:57:37.882000 211465 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3937734Z [rank1]:E1204 11:57:37.882000 211465 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3937884Z [rank1]:E1204 11:57:37.882000 211465 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3938402Z [rank1]:E1204 11:57:37.882000 211465 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:59:23.3938539Z [rank1]:E1204 11:57:37.882000 211465 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3938739Z [rank1]:E1204 11:57:37.882000 211465 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3939149Z [rank1]:E1204 11:57:37.882000 211465 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3939262Z [rank1]:E1204 11:57:37.882000 211465 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3939482Z [rank1]:E1204 11:57:37.882000 211465 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3939686Z [rank1]:E1204 11:57:37.882000 211465 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:59:23.3939726Z dist init r=1, world=4 2025-12-04T11:59:23.3939864Z [rank2]:E1204 11:57:37.901000 211466 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3940031Z [rank2]:E1204 11:57:37.901000 211466 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3940320Z [rank2]:E1204 11:57:37.901000 211466 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3940476Z [rank2]:E1204 11:57:37.901000 211466 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3940767Z [rank2]:E1204 11:57:37.901000 211466 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3940892Z [rank2]:E1204 11:57:37.901000 211466 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3941201Z [rank2]:E1204 11:57:37.901000 211466 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3941350Z [rank2]:E1204 11:57:37.901000 211466 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3941632Z [rank2]:E1204 11:57:37.901000 211466 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3941783Z [rank2]:E1204 11:57:37.901000 211466 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3942065Z [rank2]:E1204 11:57:37.901000 211466 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3942207Z [rank2]:E1204 11:57:37.901000 211466 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3942485Z [rank2]:E1204 11:57:37.901000 211466 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3942662Z [rank2]:E1204 11:57:37.901000 211466 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3943179Z [rank2]:E1204 11:57:37.901000 211466 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 
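Note: each failure banner prints a repro command to be run from the base repo dir with PYTORCH_TEST_WITH_ROCM and PYTORCH_TEST_CUDA_MEM_LEAK_CHECK set. A small sketch of invoking that exact command from Python with the same environment flags; the cwd value is the CI checkout path seen in this log and should point at a local clone instead:

    import os
    import subprocess

    env = dict(
        os.environ,
        PYTORCH_TEST_WITH_ROCM="1",
        PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1",
        # Setting PYTORCH_PRINT_REPRO_ON_FAILURE="0" instead would silence the repro banner.
    )
    subprocess.run(
        [
            "python",
            "test/distributed/fsdp/test_fsdp_exec_order.py",
            "TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda",
        ],
        cwd="/var/lib/jenkins/pytorch",  # CI checkout path from this log; adjust locally
        env=env,
        check=True,
    )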
2025-12-04T11:59:23.3943297Z [rank2]:E1204 11:57:37.901000 211466 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3943497Z [rank2]:E1204 11:57:37.901000 211466 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3943905Z [rank2]:E1204 11:57:37.901000 211466 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3944021Z [rank2]:E1204 11:57:37.901000 211466 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3944234Z [rank2]:E1204 11:57:37.901000 211466 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3944401Z [rank2]:E1204 11:57:37.901000 211466 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:59:23.3944440Z dist init r=2, world=4 2025-12-04T11:59:23.3944579Z [rank0]:E1204 11:57:37.964000 211464 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3944740Z [rank0]:E1204 11:57:37.964000 211464 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3945036Z [rank0]:E1204 11:57:37.964000 211464 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3945192Z [rank0]:E1204 11:57:37.964000 211464 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3945497Z [rank0]:E1204 11:57:37.964000 211464 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3945625Z [rank0]:E1204 11:57:37.964000 211464 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3945906Z [rank0]:E1204 11:57:37.964000 211464 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3946059Z [rank0]:E1204 11:57:37.964000 211464 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3946339Z [rank0]:E1204 11:57:37.964000 211464 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3946489Z [rank0]:E1204 11:57:37.964000 211464 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3946770Z [rank0]:E1204 11:57:37.964000 211464 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3946908Z [rank0]:E1204 11:57:37.964000 211464 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3947208Z [rank0]:E1204 11:57:37.964000 211464 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3947357Z [rank0]:E1204 11:57:37.964000 211464 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3947882Z [rank0]:E1204 11:57:37.964000 211464 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:59:23.3947994Z [rank0]:E1204 11:57:37.964000 211464 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3948195Z [rank0]:E1204 11:57:37.964000 211464 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3948605Z [rank0]:E1204 11:57:37.964000 211464 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3948718Z [rank0]:E1204 11:57:37.964000 211464 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3948934Z [rank0]:E1204 11:57:37.964000 211464 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3949097Z [rank0]:E1204 11:57:37.964000 211464 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:59:23.3949136Z dist init r=0, world=4 2025-12-04T11:59:23.3949477Z [rank0]:[W1204 11:57:38.333104418 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:59:23.3949518Z FAILED [9.0195s] [100%] 2025-12-04T11:59:23.3949539Z 2025-12-04T11:59:23.3949631Z =================================== FAILURES =================================== 2025-12-04T11:59:23.3949768Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda _ 2025-12-04T11:59:23.3949815Z Traceback (most recent call last): 2025-12-04T11:59:23.3949978Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:59:23.3950023Z self._join_processes(fn) 2025-12-04T11:59:23.3950197Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:59:23.3950252Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:59:23.3950430Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:59:23.3950475Z raise RuntimeError(error) 2025-12-04T11:59:23.3950555Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T11:59:23.3950600Z Traceback (most recent call last): 2025-12-04T11:59:23.3950763Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3950807Z getattr(self, test_name)() 2025-12-04T11:59:23.3950967Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3951038Z fn() 2025-12-04T11:59:23.3951191Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3951230Z method(*args, **kwargs) 2025-12-04T11:59:23.3951384Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3951423Z method(*args, **kwargs) 2025-12-04T11:59:23.3951580Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3951616Z with policy(): 2025-12-04T11:59:23.3951770Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3951810Z raise RuntimeError(msg) 2025-12-04T11:59:23.3952202Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2243952640 and is now 3454009344. 
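Note: the ProcessGroupNCCL warning above asks for an explicit destroy_process_group() call before the program exits. A minimal sketch of the usual pattern, assuming a torchrun-style launch that provides RANK, WORLD_SIZE and LOCAL_RANK:

    import os
    import torch
    import torch.distributed as dist

    def main() -> None:
        # Assumes torchrun (or an equivalent launcher) set RANK, WORLD_SIZE and LOCAL_RANK.
        dist.init_process_group(backend="nccl")
        torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))
        try:
            pass  # training / test body goes here
        finally:
            # Explicit teardown releases NCCL resources and avoids the
            # "destroy_process_group() was not called" warning seen above.
            dist.destroy_process_group()

    if __name__ == "__main__":
        main()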
2025-12-04T11:59:23.3952206Z 2025-12-04T11:59:23.3952281Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3952563Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3952566Z 2025-12-04T11:59:23.3952656Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3952658Z 2025-12-04T11:59:23.3952660Z 2025-12-04T11:59:23.3952735Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:59:23.3952825Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:59:23.3953077Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-d90c84a1d19c859a.xml - 2025-12-04T11:59:23.3953138Z =========================== short test summary info ============================ 2025-12-04T11:59:23.3953458Z FAILED [9.0195s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T11:59:23.3953503Z Traceback (most recent call last): 2025-12-04T11:59:23.3953669Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3953713Z getattr(self, test_name)() 2025-12-04T11:59:23.3953874Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3953911Z fn() 2025-12-04T11:59:23.3954063Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3954103Z method(*args, **kwargs) 2025-12-04T11:59:23.3954255Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3954297Z method(*args, **kwargs) 2025-12-04T11:59:23.3954450Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3954486Z with policy(): 2025-12-04T11:59:23.3954643Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3954683Z raise RuntimeError(msg) 2025-12-04T11:59:23.3955073Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2243952640 and is now 3454009344. 2025-12-04T11:59:23.3955095Z 2025-12-04T11:59:23.3955169Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3955453Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3955454Z 2025-12-04T11:59:23.3955543Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3955610Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:59:23.3955672Z ======================= 1 failed, 7 deselected in 9.03s ======================== 2025-12-04T11:59:23.3955710Z Got exit code 1 2025-12-04T11:59:23.3955937Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:59:23.3956069Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T11:59:23.3956276Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-303f589842289b36.xml 2025-12-04T11:59:23.3956334Z ============================= test session starts ============================== 2025-12-04T11:59:23.3956449Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:59:23.3956489Z cachedir: .pytest_cache 2025-12-04T11:59:23.3956652Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:59:23.3956698Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:59:23.3956739Z configfile: pytest.ini 2025-12-04T11:59:23.3956906Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:59:23.3956978Z collecting ... collected 8 items / 5 deselected / 3 selected 2025-12-04T11:59:23.3957031Z stepcurrent: skipping 5 already run items. 2025-12-04T11:59:23.3957075Z Running 3 items in this shard 2025-12-04T11:59:23.3957077Z 2025-12-04T11:59:23.3957462Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda I1204 11:57:42.383000 211797 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 211866 2025-12-04T11:59:23.3957622Z I1204 11:57:42.384000 211797 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 211867 2025-12-04T11:59:23.3957777Z I1204 11:57:42.384000 211797 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 211868 2025-12-04T11:59:23.3957934Z I1204 11:57:42.385000 211797 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 211869 2025-12-04T11:59:23.3958441Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3958503Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3958999Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:59:23.3959082Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3959624Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3959685Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3960177Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3960235Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3960381Z [rank1]:E1204 11:57:49.491000 211867 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3960548Z [rank1]:E1204 11:57:49.491000 211867 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3960842Z [rank1]:E1204 11:57:49.491000 211867 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3961000Z [rank1]:E1204 11:57:49.491000 211867 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3961290Z [rank1]:E1204 11:57:49.491000 211867 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3961416Z [rank1]:E1204 11:57:49.491000 211867 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3961698Z [rank1]:E1204 11:57:49.491000 211867 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3961871Z [rank1]:E1204 11:57:49.491000 211867 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3962152Z [rank1]:E1204 11:57:49.491000 211867 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3962300Z [rank1]:E1204 11:57:49.491000 211867 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3962584Z [rank1]:E1204 11:57:49.491000 211867 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3962723Z [rank1]:E1204 11:57:49.491000 211867 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3963007Z [rank1]:E1204 11:57:49.491000 211867 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3963157Z [rank1]:E1204 11:57:49.491000 211867 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3963678Z [rank1]:E1204 11:57:49.491000 211867 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:59:23.3963822Z [rank1]:E1204 11:57:49.491000 211867 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3964020Z [rank1]:E1204 11:57:49.491000 211867 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3964430Z [rank1]:E1204 11:57:49.491000 211867 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3964545Z [rank1]:E1204 11:57:49.491000 211867 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3964759Z [rank1]:E1204 11:57:49.491000 211867 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3964928Z [rank1]:E1204 11:57:49.491000 211867 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:59:23.3964966Z dist init r=1, world=4 2025-12-04T11:59:23.3965108Z [rank3]:E1204 11:57:49.500000 211869 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3965269Z [rank3]:E1204 11:57:49.500000 211869 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3965559Z [rank3]:E1204 11:57:49.500000 211869 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3965715Z [rank3]:E1204 11:57:49.500000 211869 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3966005Z [rank3]:E1204 11:57:49.500000 211869 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3966152Z [rank3]:E1204 11:57:49.500000 211869 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3966432Z [rank3]:E1204 11:57:49.500000 211869 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3966580Z [rank3]:E1204 11:57:49.500000 211869 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:59:23.3966860Z [rank3]:E1204 11:57:49.500000 211869 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3967009Z [rank3]:E1204 11:57:49.500000 211869 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3967288Z [rank3]:E1204 11:57:49.500000 211869 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3967428Z [rank3]:E1204 11:57:49.500000 211869 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3967708Z [rank3]:E1204 11:57:49.500000 211869 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3967875Z [rank3]:E1204 11:57:49.500000 211869 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3968398Z [rank3]:E1204 11:57:49.500000 211869 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 2025-12-04T11:59:23.3968511Z [rank3]:E1204 11:57:49.500000 211869 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3968709Z [rank3]:E1204 11:57:49.500000 211869 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3969117Z [rank3]:E1204 11:57:49.500000 211869 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3969229Z [rank3]:E1204 11:57:49.500000 211869 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3969444Z [rank3]:E1204 11:57:49.500000 211869 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3969650Z [rank3]:E1204 11:57:49.500000 211869 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:59:23.3969688Z dist init r=3, world=4 2025-12-04T11:59:23.3969826Z [rank0]:E1204 11:57:49.565000 211866 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3969993Z [rank0]:E1204 11:57:49.565000 211866 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3970281Z [rank0]:E1204 11:57:49.565000 211866 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3970459Z [rank0]:E1204 11:57:49.565000 211866 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3970747Z [rank0]:E1204 11:57:49.565000 211866 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3970872Z [rank0]:E1204 11:57:49.565000 211866 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3971155Z [rank0]:E1204 11:57:49.565000 211866 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3971302Z [rank0]:E1204 11:57:49.565000 211866 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3971585Z [rank0]:E1204 11:57:49.565000 211866 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3971732Z [rank0]:E1204 11:57:49.565000 211866 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3972014Z [rank0]:E1204 11:57:49.565000 211866 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3972178Z [rank0]:E1204 11:57:49.565000 211866 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3972459Z [rank0]:E1204 11:57:49.565000 211866 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3972611Z [rank0]:E1204 11:57:49.565000 211866 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3973127Z [rank0]:E1204 11:57:49.565000 211866 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
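The RuntimeError above is raised by the memory-leak-check context manager in common_utils.py, which compares per-device allocation counters taken before and after the test body. A rough illustration of that comparison (not the harness implementation; driver_allocated and check_leak are hypothetical names) using public torch.cuda counters:

    import torch

    def driver_allocated(device: int) -> int:
        # Driver-level usage: total minus free, analogous to the numbers in the log.
        free, total = torch.cuda.mem_get_info(device)
        return total - free

    def check_leak(fn, device: int = 0) -> None:
        torch.cuda.synchronize(device)
        alloc_before = torch.cuda.memory_allocated(device)   # caching allocator bytes
        driver_before = driver_allocated(device)
        fn()
        torch.cuda.synchronize(device)
        alloc_after = torch.cuda.memory_allocated(device)
        driver_after = driver_allocated(device)
        if alloc_after > alloc_before and driver_after > driver_before:
            raise RuntimeError(
                f"possible CUDA memory leak on device {device}: "
                f"caching allocator {alloc_before} -> {alloc_after}, "
                f"driver {driver_before} -> {driver_after}"
            )

The "confirmed" wording in the log corresponds to the case where both the caching-allocator count and the driver-level count grow, which is what this sketch tests for.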
2025-12-04T11:59:23.3973243Z [rank0]:E1204 11:57:49.565000 211866 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3973439Z [rank0]:E1204 11:57:49.565000 211866 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3973848Z [rank0]:E1204 11:57:49.565000 211866 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3973960Z [rank0]:E1204 11:57:49.565000 211866 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3974174Z [rank0]:E1204 11:57:49.565000 211866 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3974344Z [rank0]:E1204 11:57:49.565000 211866 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:59:23.3974381Z dist init r=0, world=4 2025-12-04T11:59:23.3974519Z [rank2]:E1204 11:57:49.588000 211868 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3974698Z [rank2]:E1204 11:57:49.588000 211868 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3974991Z [rank2]:E1204 11:57:49.588000 211868 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3975144Z [rank2]:E1204 11:57:49.588000 211868 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3975435Z [rank2]:E1204 11:57:49.588000 211868 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3975558Z [rank2]:E1204 11:57:49.588000 211868 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3975845Z [rank2]:E1204 11:57:49.588000 211868 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3975994Z [rank2]:E1204 11:57:49.588000 211868 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3976275Z [rank2]:E1204 11:57:49.588000 211868 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3976442Z [rank2]:E1204 11:57:49.588000 211868 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3976724Z [rank2]:E1204 11:57:49.588000 211868 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3976864Z [rank2]:E1204 11:57:49.588000 211868 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3977146Z [rank2]:E1204 11:57:49.588000 211868 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3977298Z [rank2]:E1204 11:57:49.588000 211868 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3977818Z [rank2]:E1204 11:57:49.588000 211868 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:59:23.3977933Z [rank2]:E1204 11:57:49.588000 211868 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3978130Z [rank2]:E1204 11:57:49.588000 211868 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3978537Z [rank2]:E1204 11:57:49.588000 211868 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3978653Z [rank2]:E1204 11:57:49.588000 211868 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3978865Z [rank2]:E1204 11:57:49.588000 211868 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3979175Z [rank2]:E1204 11:57:49.588000 211868 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:59:23.3979213Z dist init r=2, world=4 2025-12-04T11:59:23.3979556Z [rank0]:[W1204 11:57:49.902602146 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:59:23.3979641Z FAILED [8.9192s] [ 33%] 2025-12-04T11:59:23.3979645Z 2025-12-04T11:59:23.3979700Z =================================== FAILURES =================================== 2025-12-04T11:59:23.3979835Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda _ 2025-12-04T11:59:23.3979880Z Traceback (most recent call last): 2025-12-04T11:59:23.3980045Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:59:23.3980087Z self._join_processes(fn) 2025-12-04T11:59:23.3980265Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:59:23.3980318Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:59:23.3980497Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:59:23.3980540Z raise RuntimeError(error) 2025-12-04T11:59:23.3980649Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:59:23.3980693Z Traceback (most recent call last): 2025-12-04T11:59:23.3980856Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3980898Z getattr(self, test_name)() 2025-12-04T11:59:23.3981058Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3981097Z fn() 2025-12-04T11:59:23.3981250Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3981290Z method(*args, **kwargs) 2025-12-04T11:59:23.3981444Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3981483Z method(*args, **kwargs) 2025-12-04T11:59:23.3981635Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3981678Z with policy(): 2025-12-04T11:59:23.3981833Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3981872Z raise RuntimeError(msg) 2025-12-04T11:59:23.3982269Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 
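The ProcessGroupNCCL warning near the top of this block ("destroy_process_group() was not called before program exit") points at missing teardown in the test process. A minimal sketch of the recommended cleanup, with placeholder work and launcher-provided environment assumed (main is a hypothetical entry point):

    import torch.distributed as dist

    def main() -> None:
        # Rank, world size and master address come from the launcher (e.g. torchrun).
        dist.init_process_group("nccl")
        try:
            pass  # distributed work goes here
        finally:
            dist.destroy_process_group()  # explicit teardown avoids the warning

    if __name__ == "__main__":
        main()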
2025-12-04T11:59:23.3982272Z 2025-12-04T11:59:23.3982347Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3982630Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3982637Z 2025-12-04T11:59:23.3982725Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3982727Z 2025-12-04T11:59:23.3982729Z 2025-12-04T11:59:23.3982803Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:59:23.3982890Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:59:23.3983170Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-303f589842289b36.xml - 2025-12-04T11:59:23.3983230Z =========================== short test summary info ============================ 2025-12-04T11:59:23.3983524Z FAILED [8.9192s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:59:23.3983571Z Traceback (most recent call last): 2025-12-04T11:59:23.3983737Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3983779Z getattr(self, test_name)() 2025-12-04T11:59:23.3983941Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3983975Z fn() 2025-12-04T11:59:23.3984129Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3984169Z method(*args, **kwargs) 2025-12-04T11:59:23.3984321Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3984360Z method(*args, **kwargs) 2025-12-04T11:59:23.3984513Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3984573Z with policy(): 2025-12-04T11:59:23.3984726Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3984767Z raise RuntimeError(msg) 2025-12-04T11:59:23.3985157Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:59:23.3985159Z 2025-12-04T11:59:23.3985235Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3985517Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3985521Z 2025-12-04T11:59:23.3985608Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3985671Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:59:23.3985733Z ======================= 1 failed, 5 deselected in 8.93s ======================== 2025-12-04T11:59:23.3985768Z Got exit code 1 2025-12-04T11:59:23.3985808Z Retrying single test... 2025-12-04T11:59:23.3986017Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-f15b22de1b66815c.xml 2025-12-04T11:59:23.3986074Z ============================= test session starts ============================== 2025-12-04T11:59:23.3986189Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:59:23.3986228Z cachedir: .pytest_cache 2025-12-04T11:59:23.3986391Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:59:23.3986437Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:59:23.3986477Z configfile: pytest.ini 2025-12-04T11:59:23.3986640Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:59:23.3986711Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:59:23.3987019Z stepcurrent: skipping 5 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3987062Z Running 1 items in this shard 2025-12-04T11:59:23.3987064Z 2025-12-04T11:59:23.3987416Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda I1204 11:57:54.005000 212199 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 212268 2025-12-04T11:59:23.3987575Z I1204 11:57:54.006000 212199 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 212269 2025-12-04T11:59:23.3987730Z I1204 11:57:54.006000 212199 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 212270 2025-12-04T11:59:23.3987884Z I1204 11:57:54.007000 212199 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 212271 2025-12-04T11:59:23.3988390Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3988451Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3988975Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:59:23.3989034Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3989529Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3989625Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3990119Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.3990175Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.3990323Z [rank3]:E1204 11:58:01.186000 212271 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3990492Z [rank3]:E1204 11:58:01.186000 212271 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3990786Z [rank3]:E1204 11:58:01.186000 212271 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3990946Z [rank3]:E1204 11:58:01.186000 212271 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3991235Z [rank3]:E1204 11:58:01.186000 212271 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3991384Z [rank3]:E1204 11:58:01.186000 212271 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3991665Z [rank3]:E1204 11:58:01.186000 212271 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3991813Z [rank3]:E1204 11:58:01.186000 212271 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3992094Z [rank3]:E1204 11:58:01.186000 212271 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3992241Z [rank3]:E1204 11:58:01.186000 212271 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3992521Z [rank3]:E1204 11:58:01.186000 212271 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3992658Z [rank3]:E1204 11:58:01.186000 212271 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3992937Z [rank3]:E1204 11:58:01.186000 212271 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3993119Z [rank3]:E1204 11:58:01.186000 212271 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3993652Z [rank3]:E1204 11:58:01.186000 212271 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 2025-12-04T11:59:23.3993768Z [rank3]:E1204 11:58:01.186000 212271 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3993965Z [rank3]:E1204 11:58:01.186000 212271 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3994374Z [rank3]:E1204 11:58:01.186000 212271 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3994490Z [rank3]:E1204 11:58:01.186000 212271 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3994704Z [rank3]:E1204 11:58:01.186000 212271 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3994872Z [rank3]:E1204 11:58:01.186000 212271 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:59:23.3994911Z dist init r=3, world=4 2025-12-04T11:59:23.3995049Z [rank2]:E1204 11:58:01.194000 212270 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3995211Z [rank2]:E1204 11:58:01.194000 212270 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.3995505Z [rank2]:E1204 11:58:01.194000 212270 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.3995678Z [rank2]:E1204 11:58:01.194000 212270 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.3995966Z [rank2]:E1204 11:58:01.194000 212270 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.3996092Z [rank2]:E1204 11:58:01.194000 212270 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.3996372Z [rank2]:E1204 11:58:01.194000 212270 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3996520Z [rank2]:E1204 11:58:01.194000 212270 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:59:23.3996802Z [rank2]:E1204 11:58:01.194000 212270 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.3996950Z [rank2]:E1204 11:58:01.194000 212270 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.3997228Z [rank2]:E1204 11:58:01.194000 212270 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.3997392Z [rank2]:E1204 11:58:01.194000 212270 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.3997672Z [rank2]:E1204 11:58:01.194000 212270 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.3997821Z [rank2]:E1204 11:58:01.194000 212270 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.3998340Z [rank2]:E1204 11:58:01.194000 212270 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:59:23.3998455Z [rank2]:E1204 11:58:01.194000 212270 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3998654Z [rank2]:E1204 11:58:01.194000 212270 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.3999061Z [rank2]:E1204 11:58:01.194000 212270 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:59:23.3999173Z [rank2]:E1204 11:58:01.194000 212270 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.3999388Z [rank2]:E1204 11:58:01.194000 212270 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.3999557Z [rank2]:E1204 11:58:01.194000 212270 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:59:23.3999634Z dist init r=2, world=4 2025-12-04T11:59:23.3999772Z [rank0]:E1204 11:58:01.263000 212268 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.3999957Z [rank0]:E1204 11:58:01.263000 212268 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4000247Z [rank0]:E1204 11:58:01.263000 212268 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4000404Z [rank0]:E1204 11:58:01.263000 212268 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4000697Z [rank0]:E1204 11:58:01.263000 212268 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4000823Z [rank0]:E1204 11:58:01.263000 212268 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4001105Z [rank0]:E1204 11:58:01.263000 212268 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4001254Z [rank0]:E1204 11:58:01.263000 212268 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4001534Z [rank0]:E1204 11:58:01.263000 212268 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4001707Z [rank0]:E1204 11:58:01.263000 212268 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4001987Z [rank0]:E1204 11:58:01.263000 212268 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4002125Z [rank0]:E1204 11:58:01.263000 212268 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.4002406Z [rank0]:E1204 11:58:01.263000 212268 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4002555Z [rank0]:E1204 11:58:01.263000 212268 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4003074Z [rank0]:E1204 11:58:01.263000 212268 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
2025-12-04T11:59:23.4003193Z [rank0]:E1204 11:58:01.263000 212268 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4003392Z [rank0]:E1204 11:58:01.263000 212268 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4003801Z [rank0]:E1204 11:58:01.263000 212268 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:59:23.4003917Z [rank0]:E1204 11:58:01.263000 212268 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4004131Z [rank0]:E1204 11:58:01.263000 212268 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4004316Z [rank0]:E1204 11:58:01.263000 212268 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:59:23.4004356Z dist init r=0, world=4 2025-12-04T11:59:23.4004493Z [rank1]:E1204 11:58:01.265000 212269 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4004653Z [rank1]:E1204 11:58:01.265000 212269 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4004944Z [rank1]:E1204 11:58:01.265000 212269 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4005098Z [rank1]:E1204 11:58:01.265000 212269 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4005390Z [rank1]:E1204 11:58:01.265000 212269 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4005515Z [rank1]:E1204 11:58:01.265000 212269 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4005793Z [rank1]:E1204 11:58:01.265000 212269 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4005962Z [rank1]:E1204 11:58:01.265000 212269 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4006240Z [rank1]:E1204 11:58:01.265000 212269 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4006389Z [rank1]:E1204 11:58:01.265000 212269 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4006667Z [rank1]:E1204 11:58:01.265000 212269 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4006805Z [rank1]:E1204 11:58:01.265000 212269 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.4007087Z [rank1]:E1204 11:58:01.265000 212269 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4007236Z [rank1]:E1204 11:58:01.265000 212269 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4007754Z [rank1]:E1204 11:58:01.265000 212269 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:59:23.4007868Z [rank1]:E1204 11:58:01.265000 212269 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4008065Z [rank1]:E1204 11:58:01.265000 212269 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4008473Z [rank1]:E1204 11:58:01.265000 212269 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:59:23.4008604Z [rank1]:E1204 11:58:01.265000 212269 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4008818Z [rank1]:E1204 11:58:01.265000 212269 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4008982Z [rank1]:E1204 11:58:01.265000 212269 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:59:23.4009019Z dist init r=1, world=4 2025-12-04T11:59:23.4009361Z [rank0]:[W1204 11:58:01.568085854 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:59:23.4009399Z FAILED [9.0188s] [100%] 2025-12-04T11:59:23.4009401Z 2025-12-04T11:59:23.4009456Z =================================== FAILURES =================================== 2025-12-04T11:59:23.4009630Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda _ 2025-12-04T11:59:23.4009675Z Traceback (most recent call last): 2025-12-04T11:59:23.4009839Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:59:23.4009882Z self._join_processes(fn) 2025-12-04T11:59:23.4010056Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:59:23.4010139Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:59:23.4010317Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:59:23.4010361Z raise RuntimeError(error) 2025-12-04T11:59:23.4010438Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T11:59:23.4010484Z Traceback (most recent call last): 2025-12-04T11:59:23.4010645Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4010687Z getattr(self, test_name)() 2025-12-04T11:59:23.4010848Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4010883Z fn() 2025-12-04T11:59:23.4011033Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4011078Z method(*args, **kwargs) 2025-12-04T11:59:23.4011229Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4011268Z method(*args, **kwargs) 2025-12-04T11:59:23.4011420Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4011456Z with policy(): 2025-12-04T11:59:23.4011613Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4011655Z raise RuntimeError(msg) 2025-12-04T11:59:23.4012054Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 
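The repeated "Process N exited with error code 10 and exception" wrapper comes from the multi-process test harness joining its worker processes and re-raising when one exits nonzero (the _join_processes / _check_return_codes frames in the traceback). A simplified, hypothetical sketch of that join-and-check pattern (not the common_distributed.py code; the exit code 10 only mirrors what this log shows):

    import multiprocessing as mp

    def _worker(rank: int) -> None:
        # A failing worker reports the error to the parent via its exit code.
        raise SystemExit(10 if rank == 3 else 0)

    def run_workers(world_size: int = 4) -> None:
        ctx = mp.get_context("spawn")
        procs = [ctx.Process(target=_worker, args=(r,)) for r in range(world_size)]
        for p in procs:
            p.start()
        for rank, p in enumerate(procs):
            p.join()
            if p.exitcode != 0:
                raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")

    if __name__ == "__main__":
        run_workers()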
2025-12-04T11:59:23.4012058Z 2025-12-04T11:59:23.4012134Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4012414Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:59:23.4012416Z 2025-12-04T11:59:23.4012525Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4012527Z 2025-12-04T11:59:23.4012529Z 2025-12-04T11:59:23.4012604Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:59:23.4012691Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:59:23.4012939Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-f15b22de1b66815c.xml - 2025-12-04T11:59:23.4012999Z =========================== short test summary info ============================ 2025-12-04T11:59:23.4013293Z FAILED [9.0188s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T11:59:23.4013336Z Traceback (most recent call last): 2025-12-04T11:59:23.4013503Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4013544Z getattr(self, test_name)() 2025-12-04T11:59:23.4013706Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4013742Z fn() 2025-12-04T11:59:23.4013893Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4013962Z method(*args, **kwargs) 2025-12-04T11:59:23.4014114Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4014152Z method(*args, **kwargs) 2025-12-04T11:59:23.4014302Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4014340Z with policy(): 2025-12-04T11:59:23.4014494Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4014536Z raise RuntimeError(msg) 2025-12-04T11:59:23.4014928Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 2025-12-04T11:59:23.4014931Z 2025-12-04T11:59:23.4015005Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4015283Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:59:23.4015285Z 2025-12-04T11:59:23.4015371Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4015434Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:59:23.4015496Z ======================= 1 failed, 7 deselected in 9.03s ======================== 2025-12-04T11:59:23.4015534Z Got exit code 1 2025-12-04T11:59:23.4015572Z Retrying single test... 2025-12-04T11:59:23.4015780Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-d8c3dbf45df3e6f1.xml 2025-12-04T11:59:23.4015839Z ============================= test session starts ============================== 2025-12-04T11:59:23.4015952Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:59:23.4015991Z cachedir: .pytest_cache 2025-12-04T11:59:23.4016152Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:59:23.4016196Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:59:23.4016258Z configfile: pytest.ini 2025-12-04T11:59:23.4016422Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:59:23.4016494Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:59:23.4016766Z stepcurrent: skipping 5 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:59:23.4016810Z Running 1 items in this shard 2025-12-04T11:59:23.4016813Z 2025-12-04T11:59:23.4017169Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda I1204 11:58:05.660000 212601 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 212670 2025-12-04T11:59:23.4017328Z I1204 11:58:05.661000 212601 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 212671 2025-12-04T11:59:23.4017483Z I1204 11:58:05.662000 212601 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 212672 2025-12-04T11:59:23.4017634Z I1204 11:58:05.662000 212601 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 212673 2025-12-04T11:59:23.4018131Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.4018217Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.4018716Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:59:23.4018775Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.4019275Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.4019334Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.4019859Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.4019916Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.4020057Z [rank0]:E1204 11:58:12.865000 212670 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4020222Z [rank0]:E1204 11:58:12.865000 212670 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4020516Z [rank0]:E1204 11:58:12.865000 212670 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4020701Z [rank0]:E1204 11:58:12.865000 212670 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4020992Z [rank0]:E1204 11:58:12.865000 212670 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4021117Z [rank0]:E1204 11:58:12.865000 212670 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4021397Z [rank0]:E1204 11:58:12.865000 212670 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4021546Z [rank0]:E1204 11:58:12.865000 212670 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4021827Z [rank0]:E1204 11:58:12.865000 212670 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4021974Z [rank0]:E1204 11:58:12.865000 212670 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4022253Z [rank0]:E1204 11:58:12.865000 212670 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4022418Z [rank0]:E1204 11:58:12.865000 212670 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.4022699Z [rank0]:E1204 11:58:12.865000 212670 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4022851Z [rank0]:E1204 11:58:12.865000 212670 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4023370Z [rank0]:E1204 11:58:12.865000 212670 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:59:23.4023488Z [rank0]:E1204 11:58:12.865000 212670 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4023688Z [rank0]:E1204 11:58:12.865000 212670 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4024100Z [rank0]:E1204 11:58:12.865000 212670 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:59:23.4024216Z [rank0]:E1204 11:58:12.865000 212670 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4024427Z [rank0]:E1204 11:58:12.865000 212670 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4024597Z [rank0]:E1204 11:58:12.865000 212670 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:59:23.4024634Z dist init r=0, world=4 2025-12-04T11:59:23.4024774Z [rank3]:E1204 11:58:12.880000 212673 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4024952Z [rank3]:E1204 11:58:12.880000 212673 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4025243Z [rank3]:E1204 11:58:12.880000 212673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4025396Z [rank3]:E1204 11:58:12.880000 212673 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4025682Z [rank3]:E1204 11:58:12.880000 212673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4025809Z [rank3]:E1204 11:58:12.880000 212673 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4026093Z [rank3]:E1204 11:58:12.880000 212673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4026243Z [rank3]:E1204 11:58:12.880000 212673 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:59:23.4026523Z [rank3]:E1204 11:58:12.880000 212673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4026688Z [rank3]:E1204 11:58:12.880000 212673 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4026969Z [rank3]:E1204 11:58:12.880000 212673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4027106Z [rank3]:E1204 11:58:12.880000 212673 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.4027386Z [rank3]:E1204 11:58:12.880000 212673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4027535Z [rank3]:E1204 11:58:12.880000 212673 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4028054Z [rank3]:E1204 11:58:12.880000 212673 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 2025-12-04T11:59:23.4028172Z [rank3]:E1204 11:58:12.880000 212673 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4028370Z [rank3]:E1204 11:58:12.880000 212673 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4028781Z [rank3]:E1204 11:58:12.880000 212673 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:59:23.4028895Z [rank3]:E1204 11:58:12.880000 212673 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4029109Z [rank3]:E1204 11:58:12.880000 212673 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4029307Z [rank3]:E1204 11:58:12.880000 212673 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:59:23.4029347Z dist init r=3, world=4 2025-12-04T11:59:23.4029483Z [rank1]:E1204 11:58:12.887000 212671 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4029684Z [rank1]:E1204 11:58:12.887000 212671 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4029975Z [rank1]:E1204 11:58:12.887000 212671 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4030132Z [rank1]:E1204 11:58:12.887000 212671 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4030425Z [rank1]:E1204 11:58:12.887000 212671 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4030548Z [rank1]:E1204 11:58:12.887000 212671 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4030829Z [rank1]:E1204 11:58:12.887000 212671 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4030980Z [rank1]:E1204 11:58:12.887000 212671 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4031287Z [rank1]:E1204 11:58:12.887000 212671 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4031436Z [rank1]:E1204 11:58:12.887000 212671 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4031715Z [rank1]:E1204 11:58:12.887000 212671 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4031858Z [rank1]:E1204 11:58:12.887000 212671 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.4032138Z [rank1]:E1204 11:58:12.887000 212671 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4032288Z [rank1]:E1204 11:58:12.887000 212671 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4032804Z [rank1]:E1204 11:58:12.887000 212671 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 
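[editor's sketch] The RuntimeError above describes the mechanism of the memory-leak check: per-device memory counters are snapshotted before the test and compared afterwards, at both the caching-allocator level and the driver level. A minimal Python illustration of that idea follows; it is an assumption-laden sketch, not the actual leak-check context manager in torch/testing/_internal/common_utils.py.

    import torch

    def snapshot(device: int):
        # Caching-allocator view and driver-level view of allocated memory on one device.
        alloc = torch.cuda.memory_allocated(device)
        free, total = torch.cuda.mem_get_info(device)
        return alloc, total - free

    def check_no_leak(device: int, test_fn):
        alloc_before, driver_before = snapshot(device)
        test_fn()
        torch.cuda.synchronize(device)
        alloc_after, driver_after = snapshot(device)
        if alloc_after > alloc_before or driver_after > driver_before:
            raise RuntimeError(
                f"leak on device {device}: caching allocator "
                f"{alloc_before} -> {alloc_after}, driver {driver_before} -> {driver_after}"
            )

The numbers in the log (e.g. 512 -> 3584 allocator bytes, ~2.25 GB -> ~3.45 GB driver memory on device 3) are exactly this kind of before/after pair.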
2025-12-04T11:59:23.4032919Z [rank1]:E1204 11:58:12.887000 212671 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4033116Z [rank1]:E1204 11:58:12.887000 212671 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4033531Z [rank1]:E1204 11:58:12.887000 212671 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:59:23.4033667Z [rank1]:E1204 11:58:12.887000 212671 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4033879Z [rank1]:E1204 11:58:12.887000 212671 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4034046Z [rank1]:E1204 11:58:12.887000 212671 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:59:23.4034083Z dist init r=1, world=4 2025-12-04T11:59:23.4034224Z [rank2]:E1204 11:58:12.892000 212672 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4034386Z [rank2]:E1204 11:58:12.892000 212672 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4034677Z [rank2]:E1204 11:58:12.892000 212672 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4034834Z [rank2]:E1204 11:58:12.892000 212672 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4035122Z [rank2]:E1204 11:58:12.892000 212672 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4035248Z [rank2]:E1204 11:58:12.892000 212672 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4035543Z [rank2]:E1204 11:58:12.892000 212672 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4035695Z [rank2]:E1204 11:58:12.892000 212672 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4035975Z [rank2]:E1204 11:58:12.892000 212672 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4036122Z [rank2]:E1204 11:58:12.892000 212672 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4036398Z [rank2]:E1204 11:58:12.892000 212672 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4036537Z [rank2]:E1204 11:58:12.892000 212672 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.4036826Z [rank2]:E1204 11:58:12.892000 212672 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4036973Z [rank2]:E1204 11:58:12.892000 212672 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4037489Z [rank2]:E1204 11:58:12.892000 212672 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:59:23.4037605Z [rank2]:E1204 11:58:12.892000 212672 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4037800Z [rank2]:E1204 11:58:12.892000 212672 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4038226Z [rank2]:E1204 11:58:12.892000 212672 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:59:23.4038338Z [rank2]:E1204 11:58:12.892000 212672 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4038551Z [rank2]:E1204 11:58:12.892000 212672 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4038717Z [rank2]:E1204 11:58:12.892000 212672 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:59:23.4038754Z dist init r=2, world=4 2025-12-04T11:59:23.4039094Z [rank0]:[W1204 11:58:13.031702599 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:59:23.4039132Z FAILED [9.1209s] [100%] 2025-12-04T11:59:23.4039134Z 2025-12-04T11:59:23.4039188Z =================================== FAILURES =================================== 2025-12-04T11:59:23.4039323Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda _ 2025-12-04T11:59:23.4039367Z Traceback (most recent call last): 2025-12-04T11:59:23.4039553Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:59:23.4039626Z self._join_processes(fn) 2025-12-04T11:59:23.4039799Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:59:23.4039854Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:59:23.4040034Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:59:23.4040076Z raise RuntimeError(error) 2025-12-04T11:59:23.4040156Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:59:23.4040202Z Traceback (most recent call last): 2025-12-04T11:59:23.4040364Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4040408Z getattr(self, test_name)() 2025-12-04T11:59:23.4040567Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4040601Z fn() 2025-12-04T11:59:23.4040751Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4040791Z method(*args, **kwargs) 2025-12-04T11:59:23.4040944Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4040985Z method(*args, **kwargs) 2025-12-04T11:59:23.4041138Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4041176Z with policy(): 2025-12-04T11:59:23.4041330Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4041373Z raise RuntimeError(msg) 2025-12-04T11:59:23.4041772Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
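[editor's sketch] The ProcessGroupNCCL warning above ("destroy_process_group() was not called before program exit, which can leak resources") points at the documented shutdown step. A minimal sketch of that shutdown follows; the init arguments are placeholders, not the values the test harness actually uses.

    import torch.distributed as dist

    # Placeholder init; the harness supplies rank/world_size/store itself.
    dist.init_process_group(backend="nccl", init_method="env://")
    try:
        pass  # training or test body
    finally:
        # Explicit shutdown avoids the "was not called before program exit" warning.
        dist.destroy_process_group()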
2025-12-04T11:59:23.4041775Z 2025-12-04T11:59:23.4041877Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4042162Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:59:23.4042164Z 2025-12-04T11:59:23.4042252Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4042254Z 2025-12-04T11:59:23.4042256Z 2025-12-04T11:59:23.4042333Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:59:23.4042423Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:59:23.4042678Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-d8c3dbf45df3e6f1.xml - 2025-12-04T11:59:23.4042738Z =========================== short test summary info ============================ 2025-12-04T11:59:23.4043038Z FAILED [9.1209s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:59:23.4043083Z Traceback (most recent call last): 2025-12-04T11:59:23.4043249Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4043294Z getattr(self, test_name)() 2025-12-04T11:59:23.4043481Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4043517Z fn() 2025-12-04T11:59:23.4043670Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4043711Z method(*args, **kwargs) 2025-12-04T11:59:23.4043865Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4043905Z method(*args, **kwargs) 2025-12-04T11:59:23.4044058Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4044093Z with policy(): 2025-12-04T11:59:23.4044246Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4044287Z raise RuntimeError(msg) 2025-12-04T11:59:23.4044680Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:59:23.4044682Z 2025-12-04T11:59:23.4044757Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4045039Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:59:23.4045043Z 2025-12-04T11:59:23.4045129Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4045195Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:59:23.4045256Z ======================= 1 failed, 7 deselected in 9.13s ======================== 2025-12-04T11:59:23.4045297Z Got exit code 1 2025-12-04T11:59:23.4045525Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:59:23.4045655Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T11:59:23.4045888Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-407fc34d7680a29b.xml 2025-12-04T11:59:23.4045947Z ============================= test session starts ============================== 2025-12-04T11:59:23.4046058Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:59:23.4046099Z cachedir: .pytest_cache 2025-12-04T11:59:23.4046261Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:59:23.4046308Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:59:23.4046347Z configfile: pytest.ini 2025-12-04T11:59:23.4046511Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:59:23.4046582Z collecting ... collected 8 items / 6 deselected / 2 selected 2025-12-04T11:59:23.4046635Z stepcurrent: skipping 6 already run items. 2025-12-04T11:59:23.4046678Z Running 2 items in this shard 2025-12-04T11:59:23.4046680Z 2025-12-04T11:59:23.4046986Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda I1204 11:58:17.322000 213003 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 213072 2025-12-04T11:59:23.4047143Z I1204 11:58:17.323000 213003 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 213073 2025-12-04T11:59:23.4047318Z I1204 11:58:17.323000 213003 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 213074 2025-12-04T11:59:23.4047471Z I1204 11:58:17.324000 213003 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 213075 2025-12-04T11:59:23.4047978Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.4048041Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.4048537Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
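[editor's sketch] The repeated UserWarning above already states the two remedies: call torch.cuda.set_device() before FSDP initialization, or pass an indexed device as device_id instead of the bare "cuda" string. A short sketch of both follows; `model` and `rank` are placeholder names, not taken from test_fsdp_exec_order.py.

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap(model: torch.nn.Module, rank: int) -> FSDP:
        # Remedy 1: bind this process to an explicit device before FSDP init.
        torch.cuda.set_device(rank)
        # Remedy 2: pass an indexed device rather than the ambiguous "cuda".
        return FSDP(model, device_id=torch.device("cuda", rank))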
2025-12-04T11:59:23.4048599Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.4049098Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.4049154Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.4049689Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.4049746Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.4049893Z [rank2]:E1204 11:58:24.491000 213074 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4050082Z [rank2]:E1204 11:58:24.491000 213074 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4050378Z [rank2]:E1204 11:58:24.491000 213074 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4050537Z [rank2]:E1204 11:58:24.491000 213074 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4050828Z [rank2]:E1204 11:58:24.491000 213074 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4050955Z [rank2]:E1204 11:58:24.491000 213074 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4051238Z [rank2]:E1204 11:58:24.491000 213074 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4051389Z [rank2]:E1204 11:58:24.491000 213074 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4051668Z [rank2]:E1204 11:58:24.491000 213074 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4051840Z [rank2]:E1204 11:58:24.491000 213074 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4052119Z [rank2]:E1204 11:58:24.491000 213074 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4052258Z [rank2]:E1204 11:58:24.491000 213074 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.4052541Z [rank2]:E1204 11:58:24.491000 213074 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4052689Z [rank2]:E1204 11:58:24.491000 213074 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4053172Z [rank2]:E1204 11:58:24.491000 213074 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 2025-12-04T11:59:23.4053292Z [rank2]:E1204 11:58:24.491000 213074 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4053487Z [rank2]:E1204 11:58:24.491000 213074 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4053843Z [rank2]:E1204 11:58:24.491000 213074 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:59:23.4053959Z [rank2]:E1204 11:58:24.491000 213074 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4054172Z [rank2]:E1204 11:58:24.491000 213074 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4054360Z [rank2]:E1204 11:58:24.491000 213074 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:59:23.4054399Z dist init r=2, world=4 2025-12-04T11:59:23.4054536Z [rank1]:E1204 11:58:24.498000 213073 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4054696Z [rank1]:E1204 11:58:24.498000 213073 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4054983Z [rank1]:E1204 11:58:24.498000 213073 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4055139Z [rank1]:E1204 11:58:24.498000 213073 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4055429Z [rank1]:E1204 11:58:24.498000 213073 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4055554Z [rank1]:E1204 11:58:24.498000 213073 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4055833Z [rank1]:E1204 11:58:24.498000 213073 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4055980Z [rank1]:E1204 11:58:24.498000 213073 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4056281Z [rank1]:E1204 11:58:24.498000 213073 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4056428Z [rank1]:E1204 11:58:24.498000 213073 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4056710Z [rank1]:E1204 11:58:24.498000 213073 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4056849Z [rank1]:E1204 11:58:24.498000 213073 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.4057131Z [rank1]:E1204 11:58:24.498000 213073 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4057283Z [rank1]:E1204 11:58:24.498000 213073 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4057753Z [rank1]:E1204 11:58:24.498000 213073 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:59:23.4057868Z [rank1]:E1204 11:58:24.498000 213073 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4058065Z [rank1]:E1204 11:58:24.498000 213073 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4058419Z [rank1]:E1204 11:58:24.498000 213073 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:59:23.4058535Z [rank1]:E1204 11:58:24.498000 213073 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4058765Z [rank1]:E1204 11:58:24.498000 213073 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4058933Z [rank1]:E1204 11:58:24.498000 213073 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:59:23.4058972Z dist init r=1, world=4 2025-12-04T11:59:23.4059111Z [rank3]:E1204 11:58:24.504000 213075 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4059273Z [rank3]:E1204 11:58:24.504000 213075 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4059566Z [rank3]:E1204 11:58:24.504000 213075 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4059815Z [rank3]:E1204 11:58:24.504000 213075 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4060100Z [rank3]:E1204 11:58:24.504000 213075 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4060225Z [rank3]:E1204 11:58:24.504000 213075 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4060537Z [rank3]:E1204 11:58:24.504000 213075 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4060685Z [rank3]:E1204 11:58:24.504000 213075 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4060965Z [rank3]:E1204 11:58:24.504000 213075 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4061115Z [rank3]:E1204 11:58:24.504000 213075 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4061393Z [rank3]:E1204 11:58:24.504000 213075 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4061535Z [rank3]:E1204 11:58:24.504000 213075 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.4061821Z [rank3]:E1204 11:58:24.504000 213075 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4061973Z [rank3]:E1204 11:58:24.504000 213075 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4062441Z [rank3]:E1204 11:58:24.504000 213075 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3456106496. 
2025-12-04T11:59:23.4062556Z [rank3]:E1204 11:58:24.504000 213075 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4062755Z [rank3]:E1204 11:58:24.504000 213075 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4063135Z [rank3]:E1204 11:58:24.504000 213075 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:59:23.4063248Z [rank3]:E1204 11:58:24.504000 213075 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4063462Z [rank3]:E1204 11:58:24.504000 213075 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4063628Z [rank3]:E1204 11:58:24.504000 213075 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:59:23.4063670Z dist init r=3, world=4 2025-12-04T11:59:23.4063807Z [rank0]:E1204 11:58:24.572000 213072 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4063972Z [rank0]:E1204 11:58:24.572000 213072 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4064264Z [rank0]:E1204 11:58:24.572000 213072 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4064421Z [rank0]:E1204 11:58:24.572000 213072 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4064709Z [rank0]:E1204 11:58:24.572000 213072 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4064850Z [rank0]:E1204 11:58:24.572000 213072 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4065131Z [rank0]:E1204 11:58:24.572000 213072 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4065279Z [rank0]:E1204 11:58:24.572000 213072 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4065560Z [rank0]:E1204 11:58:24.572000 213072 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4065706Z [rank0]:E1204 11:58:24.572000 213072 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4065987Z [rank0]:E1204 11:58:24.572000 213072 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4066124Z [rank0]:E1204 11:58:24.572000 213072 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:59:23.4066405Z [rank0]:E1204 11:58:24.572000 213072 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4066555Z [rank0]:E1204 11:58:24.572000 213072 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4067020Z [rank0]:E1204 11:58:24.572000 213072 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 0. CUDA driver allocated memory was 2459959296 and is now 3665821696. 2025-12-04T11:59:23.4067134Z [rank0]:E1204 11:58:24.572000 213072 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4067347Z [rank0]:E1204 11:58:24.572000 213072 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4067697Z [rank0]:E1204 11:58:24.572000 213072 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:59:23.4067809Z [rank0]:E1204 11:58:24.572000 213072 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4068023Z [rank0]:E1204 11:58:24.572000 213072 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4068189Z [rank0]:E1204 11:58:24.572000 213072 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:59:23.4068226Z dist init r=0, world=4 2025-12-04T11:59:23.4068264Z FAILED [8.3162s] [ 50%] 2025-12-04T11:59:23.4068268Z 2025-12-04T11:59:23.4068323Z =================================== FAILURES =================================== 2025-12-04T11:59:23.4068418Z ________ TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda _________ 2025-12-04T11:59:23.4068464Z Traceback (most recent call last): 2025-12-04T11:59:23.4068629Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:59:23.4068690Z self._join_processes(fn) 2025-12-04T11:59:23.4068863Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:59:23.4068916Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:59:23.4069098Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:59:23.4069140Z raise RuntimeError(error) 2025-12-04T11:59:23.4069222Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:59:23.4069265Z Traceback (most recent call last): 2025-12-04T11:59:23.4069429Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4069470Z getattr(self, test_name)() 2025-12-04T11:59:23.4069664Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 772, in wrapper 2025-12-04T11:59:23.4069701Z fn() 2025-12-04T11:59:23.4069853Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4069893Z method(*args, **kwargs) 2025-12-04T11:59:23.4070045Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4070085Z method(*args, **kwargs) 2025-12-04T11:59:23.4070237Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4070274Z with policy(): 2025-12-04T11:59:23.4070427Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4070468Z raise RuntimeError(msg) 2025-12-04T11:59:23.4070811Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:59:23.4070815Z 2025-12-04T11:59:23.4070891Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4071120Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:59:23.4071146Z 2025-12-04T11:59:23.4071234Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4071237Z 2025-12-04T11:59:23.4071238Z 2025-12-04T11:59:23.4071314Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:59:23.4071401Z Process 1 terminated with exit code 10, terminating remaining processes. 
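[editor's sketch] The captured-stdout line above ("Process 1 terminated with exit code 10, terminating remaining processes.") comes from the parent process joining its spawned workers, as the _join_processes/_check_return_codes frames in the traceback indicate. A rough sketch of that behaviour using plain multiprocessing follows; it is an illustration, not the actual common_distributed.py logic.

    import multiprocessing as mp

    def join_and_check(processes: list[mp.Process]) -> None:
        for rank, proc in enumerate(processes):
            proc.join()
            if proc.exitcode != 0:
                print(f"Process {rank} terminated with exit code {proc.exitcode}, "
                      "terminating remaining processes.")
                for other in processes:
                    if other.is_alive():
                        other.terminate()
                raise RuntimeError(
                    f"Process {rank} exited with error code {proc.exitcode}"
                )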
2025-12-04T11:59:23.4071656Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-407fc34d7680a29b.xml - 2025-12-04T11:59:23.4071717Z =========================== short test summary info ============================ 2025-12-04T11:59:23.4071959Z FAILED [8.3162s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:59:23.4072002Z Traceback (most recent call last): 2025-12-04T11:59:23.4072171Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4072211Z getattr(self, test_name)() 2025-12-04T11:59:23.4072375Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4072408Z fn() 2025-12-04T11:59:23.4072561Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4072624Z method(*args, **kwargs) 2025-12-04T11:59:23.4072776Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4072814Z method(*args, **kwargs) 2025-12-04T11:59:23.4072966Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4073003Z with policy(): 2025-12-04T11:59:23.4073160Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4073200Z raise RuntimeError(msg) 2025-12-04T11:59:23.4073543Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:59:23.4073547Z 2025-12-04T11:59:23.4073621Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4073846Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:59:23.4073848Z 2025-12-04T11:59:23.4073936Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4073999Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:59:23.4074061Z ======================= 1 failed, 6 deselected in 8.33s ======================== 2025-12-04T11:59:23.4074096Z Got exit code 1 2025-12-04T11:59:23.4074135Z Retrying single test... 
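[editor's sketch] The lines above ("Got exit code 1", "Retrying single test...", and earlier "FAILED CONSISTENTLY ... continuing with the rest of the tests due to continue-through-error being set") show the harness's retry policy: retry the single failing test once, then either abort or keep running the shard. A hedged sketch of that flow follows; the function names are hypothetical, not the actual test-runner code.

    def run_with_retry(run_single_test, test_id: str, continue_through_error: bool) -> bool:
        # run_single_test is assumed to return the pytest exit code for one test id.
        code = run_single_test(test_id)
        if code == 0:
            return True
        print(f"Got exit code {code}")
        print("Retrying single test...")
        if run_single_test(test_id) == 0:
            return True  # flaky: passed on the retry
        print(f"FAILED CONSISTENTLY: {test_id}")
        if not continue_through_error:
            raise SystemExit(1)
        return False  # recorded as failed, but the shard keeps running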
2025-12-04T11:59:23.4074342Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-46c3c722696407a3.xml 2025-12-04T11:59:23.4074399Z ============================= test session starts ============================== 2025-12-04T11:59:23.4074513Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:59:23.4074554Z cachedir: .pytest_cache 2025-12-04T11:59:23.4074713Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:59:23.4074760Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:59:23.4074798Z configfile: pytest.ini 2025-12-04T11:59:23.4074986Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:59:23.4075058Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:59:23.4075276Z stepcurrent: skipping 6 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda 2025-12-04T11:59:23.4075319Z Running 1 items in this shard 2025-12-04T11:59:23.4075323Z 2025-12-04T11:59:23.4075627Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda I1204 11:58:28.202000 213397 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 213466 2025-12-04T11:59:23.4075786Z I1204 11:58:28.203000 213397 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 213467 2025-12-04T11:59:23.4075941Z I1204 11:58:28.203000 213397 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 213468 2025-12-04T11:59:23.4076092Z I1204 11:58:28.204000 213397 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 213469 2025-12-04T11:59:23.4076596Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.4076683Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.4077181Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.4077239Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.4077732Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:59:23.4077790Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.4078283Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.4078339Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.4078483Z [rank3]:E1204 11:58:35.392000 213469 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4078646Z [rank3]:E1204 11:58:35.392000 213469 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4078943Z [rank3]:E1204 11:58:35.392000 213469 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4079099Z [rank3]:E1204 11:58:35.392000 213469 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4079405Z [rank3]:E1204 11:58:35.392000 213469 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4079532Z [rank3]:E1204 11:58:35.392000 213469 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4079843Z [rank3]:E1204 11:58:35.392000 213469 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4079994Z [rank3]:E1204 11:58:35.392000 213469 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4080274Z [rank3]:E1204 11:58:35.392000 213469 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4080423Z [rank3]:E1204 11:58:35.392000 213469 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4080704Z [rank3]:E1204 11:58:35.392000 213469 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4080841Z [rank3]:E1204 11:58:35.392000 213469 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.4081146Z [rank3]:E1204 11:58:35.392000 213469 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4081294Z [rank3]:E1204 11:58:35.392000 213469 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4081768Z [rank3]:E1204 11:58:35.392000 213469 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3456106496. 2025-12-04T11:59:23.4081884Z [rank3]:E1204 11:58:35.392000 213469 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4082086Z [rank3]:E1204 11:58:35.392000 213469 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4082440Z [rank3]:E1204 11:58:35.392000 213469 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:59:23.4082554Z [rank3]:E1204 11:58:35.392000 213469 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4082766Z [rank3]:E1204 11:58:35.392000 213469 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4082934Z [rank3]:E1204 11:58:35.392000 213469 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:59:23.4082973Z dist init r=3, world=4 2025-12-04T11:59:23.4083113Z [rank1]:E1204 11:58:35.395000 213467 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4083273Z [rank1]:E1204 11:58:35.395000 213467 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4083587Z [rank1]:E1204 11:58:35.395000 213467 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4083740Z [rank1]:E1204 11:58:35.395000 213467 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4084027Z [rank1]:E1204 11:58:35.395000 213467 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4084152Z [rank1]:E1204 11:58:35.395000 213467 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4084431Z [rank1]:E1204 11:58:35.395000 213467 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4084581Z [rank1]:E1204 11:58:35.395000 213467 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4084859Z [rank1]:E1204 11:58:35.395000 213467 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4085007Z [rank1]:E1204 11:58:35.395000 213467 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4085286Z [rank1]:E1204 11:58:35.395000 213467 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4085442Z [rank1]:E1204 11:58:35.395000 213467 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.4085722Z [rank1]:E1204 11:58:35.395000 213467 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4085871Z [rank1]:E1204 11:58:35.395000 213467 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4086338Z [rank1]:E1204 11:58:35.395000 213467 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:59:23.4086452Z [rank1]:E1204 11:58:35.395000 213467 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4086653Z [rank1]:E1204 11:58:35.395000 213467 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4087006Z [rank1]:E1204 11:58:35.395000 213467 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:59:23.4087121Z [rank1]:E1204 11:58:35.395000 213467 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4087339Z [rank1]:E1204 11:58:35.395000 213467 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4087509Z [rank1]:E1204 11:58:35.395000 213467 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:59:23.4087547Z dist init r=1, world=4 2025-12-04T11:59:23.4087685Z [rank2]:E1204 11:58:35.458000 213468 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4087865Z [rank2]:E1204 11:58:35.458000 213468 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4088156Z [rank2]:E1204 11:58:35.458000 213468 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4088313Z [rank2]:E1204 11:58:35.458000 213468 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4088599Z [rank2]:E1204 11:58:35.458000 213468 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4088725Z [rank2]:E1204 11:58:35.458000 213468 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4089009Z [rank2]:E1204 11:58:35.458000 213468 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4089158Z [rank2]:E1204 11:58:35.458000 213468 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4089440Z [rank2]:E1204 11:58:35.458000 213468 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4089664Z [rank2]:E1204 11:58:35.458000 213468 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4089945Z [rank2]:E1204 11:58:35.458000 213468 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4090084Z [rank2]:E1204 11:58:35.458000 213468 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.4090365Z [rank2]:E1204 11:58:35.458000 213468 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4090515Z [rank2]:E1204 11:58:35.458000 213468 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4090987Z [rank2]:E1204 11:58:35.458000 213468 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 
2025-12-04T11:59:23.4091103Z [rank2]:E1204 11:58:35.458000 213468 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4091301Z [rank2]:E1204 11:58:35.458000 213468 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4091655Z [rank2]:E1204 11:58:35.458000 213468 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:59:23.4091769Z [rank2]:E1204 11:58:35.458000 213468 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4091985Z [rank2]:E1204 11:58:35.458000 213468 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4092180Z [rank2]:E1204 11:58:35.458000 213468 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:59:23.4092220Z dist init r=2, world=4 2025-12-04T11:59:23.4092357Z [rank0]:E1204 11:58:35.464000 213466 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4092520Z [rank0]:E1204 11:58:35.464000 213466 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4092811Z [rank0]:E1204 11:58:35.464000 213466 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4092967Z [rank0]:E1204 11:58:35.464000 213466 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4093258Z [rank0]:E1204 11:58:35.464000 213466 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4093381Z [rank0]:E1204 11:58:35.464000 213466 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4093662Z [rank0]:E1204 11:58:35.464000 213466 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4093834Z [rank0]:E1204 11:58:35.464000 213466 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4094114Z [rank0]:E1204 11:58:35.464000 213466 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4094262Z [rank0]:E1204 11:58:35.464000 213466 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4094540Z [rank0]:E1204 11:58:35.464000 213466 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4094677Z [rank0]:E1204 11:58:35.464000 213466 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:59:23.4094959Z [rank0]:E1204 11:58:35.464000 213466 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4095112Z [rank0]:E1204 11:58:35.464000 213466 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4095581Z [rank0]:E1204 11:58:35.464000 213466 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 0. CUDA driver allocated memory was 2459959296 and is now 3665821696. 2025-12-04T11:59:23.4095696Z [rank0]:E1204 11:58:35.464000 213466 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4095895Z [rank0]:E1204 11:58:35.464000 213466 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4096248Z [rank0]:E1204 11:58:35.464000 213466 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:59:23.4096360Z [rank0]:E1204 11:58:35.464000 213466 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4096589Z [rank0]:E1204 11:58:35.464000 213466 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4096756Z [rank0]:E1204 11:58:35.464000 213466 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:59:23.4096792Z dist init r=0, world=4 2025-12-04T11:59:23.4096830Z FAILED [8.4182s] [100%] 2025-12-04T11:59:23.4096832Z 2025-12-04T11:59:23.4096888Z =================================== FAILURES =================================== 2025-12-04T11:59:23.4096985Z ________ TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda _________ 2025-12-04T11:59:23.4097029Z Traceback (most recent call last): 2025-12-04T11:59:23.4097191Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:59:23.4097233Z self._join_processes(fn) 2025-12-04T11:59:23.4097408Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:59:23.4097461Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:59:23.4097640Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:59:23.4097684Z raise RuntimeError(error) 2025-12-04T11:59:23.4097762Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:59:23.4097825Z Traceback (most recent call last): 2025-12-04T11:59:23.4097987Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4098028Z getattr(self, test_name)() 2025-12-04T11:59:23.4098188Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 772, in wrapper 2025-12-04T11:59:23.4098222Z fn() 2025-12-04T11:59:23.4098376Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4098416Z method(*args, **kwargs) 2025-12-04T11:59:23.4098567Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4098606Z method(*args, **kwargs) 2025-12-04T11:59:23.4098758Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4098797Z with policy(): 2025-12-04T11:59:23.4098950Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4098991Z raise RuntimeError(msg) 2025-12-04T11:59:23.4099341Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:59:23.4099343Z 2025-12-04T11:59:23.4099418Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4099686Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:59:23.4099689Z 2025-12-04T11:59:23.4099776Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4099780Z 2025-12-04T11:59:23.4099840Z Process 3 exited with error code 10 and exception: 2025-12-04T11:59:23.4099883Z Traceback (most recent call last): 2025-12-04T11:59:23.4100048Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4100088Z getattr(self, test_name)() 2025-12-04T11:59:23.4100275Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4100309Z fn() 2025-12-04T11:59:23.4100461Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4100499Z method(*args, **kwargs) 2025-12-04T11:59:23.4100652Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4100691Z method(*args, **kwargs) 2025-12-04T11:59:23.4100843Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4100878Z with policy(): 2025-12-04T11:59:23.4101032Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4101072Z raise RuntimeError(msg) 2025-12-04T11:59:23.4101414Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3456106496. 
2025-12-04T11:59:23.4101416Z 2025-12-04T11:59:23.4101491Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4101718Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:59:23.4101744Z 2025-12-04T11:59:23.4101831Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4101834Z 2025-12-04T11:59:23.4101835Z 2025-12-04T11:59:23.4101909Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:59:23.4101997Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:59:23.4102249Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-46c3c722696407a3.xml - 2025-12-04T11:59:23.4102310Z =========================== short test summary info ============================ 2025-12-04T11:59:23.4102552Z FAILED [8.4182s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:59:23.4102599Z Traceback (most recent call last): 2025-12-04T11:59:23.4102766Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4102807Z getattr(self, test_name)() 2025-12-04T11:59:23.4102971Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4103005Z fn() 2025-12-04T11:59:23.4103159Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4103197Z method(*args, **kwargs) 2025-12-04T11:59:23.4103351Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4103389Z method(*args, **kwargs) 2025-12-04T11:59:23.4103540Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4103579Z with policy(): 2025-12-04T11:59:23.4103732Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4103772Z raise RuntimeError(msg) 2025-12-04T11:59:23.4104131Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 
2025-12-04T11:59:23.4104133Z 2025-12-04T11:59:23.4104204Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4104427Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:59:23.4104429Z 2025-12-04T11:59:23.4104517Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4104521Z 2025-12-04T11:59:23.4104580Z Process 3 exited with error code 10 and exception: 2025-12-04T11:59:23.4104623Z Traceback (most recent call last): 2025-12-04T11:59:23.4104785Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4104827Z getattr(self, test_name)() 2025-12-04T11:59:23.4104990Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4105023Z fn() 2025-12-04T11:59:23.4105174Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4105214Z method(*args, **kwargs) 2025-12-04T11:59:23.4105365Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4105423Z method(*args, **kwargs) 2025-12-04T11:59:23.4105576Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4105612Z with policy(): 2025-12-04T11:59:23.4105766Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4105806Z raise RuntimeError(msg) 2025-12-04T11:59:23.4106149Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3456106496. 2025-12-04T11:59:23.4106151Z 2025-12-04T11:59:23.4106224Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4106447Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:59:23.4106452Z 2025-12-04T11:59:23.4106538Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4106601Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:59:23.4106662Z ======================= 1 failed, 7 deselected in 8.43s ======================== 2025-12-04T11:59:23.4106698Z Got exit code 1 2025-12-04T11:59:23.4106738Z Retrying single test... 
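The RuntimeError repeated for every rank above comes from a pre/post comparison of per-device memory counters: with PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 the harness records the caching-allocator figure before the test and flags the test when the number is still higher afterwards, with the driver-level numbers quoted as confirmation. The following is a much-simplified sketch of that idea, not PyTorch's actual leak checker in torch/testing/_internal/common_utils.py.

    # Simplified illustration of a CUDA memory-leak check. PyTorch's real check is
    # more involved (it also consults the driver-level figures shown in the error
    # message above); this only compares caching-allocator usage across a block.
    import torch

    class SimpleLeakCheck:
        """Fail if caching-allocator usage on `device` grew across the block."""

        def __init__(self, device: int = 0):
            self.device = device

        def __enter__(self):
            torch.cuda.synchronize(self.device)
            self.before = torch.cuda.memory_allocated(self.device)
            return self

        def __exit__(self, exc_type, exc, tb):
            if exc_type is not None:
                return False  # keep the test's own exception
            torch.cuda.synchronize(self.device)
            after = torch.cuda.memory_allocated(self.device)
            if after > self.before:
                raise RuntimeError(
                    f"possible leak on device {self.device}: "
                    f"{self.before} -> {after} bytes still allocated"
                )
            return False

For instance, `with SimpleLeakCheck(1): run_test()` would raise for the 512 -> 3584 byte growth reported on device 1 in the failures above.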
2025-12-04T11:59:23.4106947Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-3760c1e6f8841a9d.xml 2025-12-04T11:59:23.4107003Z ============================= test session starts ============================== 2025-12-04T11:59:23.4107120Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:59:23.4107159Z cachedir: .pytest_cache 2025-12-04T11:59:23.4107320Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:59:23.4107367Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:59:23.4107406Z configfile: pytest.ini 2025-12-04T11:59:23.4107570Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:59:23.4107641Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:59:23.4107882Z stepcurrent: skipping 6 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda 2025-12-04T11:59:23.4107925Z Running 1 items in this shard 2025-12-04T11:59:23.4107927Z 2025-12-04T11:59:23.4108228Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda I1204 11:58:39.310000 213791 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 213860 2025-12-04T11:59:23.4108387Z I1204 11:58:39.311000 213791 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 213861 2025-12-04T11:59:23.4108540Z I1204 11:58:39.311000 213791 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 213862 2025-12-04T11:59:23.4108692Z I1204 11:58:39.312000 213791 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 213863 2025-12-04T11:59:23.4109198Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.4109259Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.4109815Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.4109875Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.4110371Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:59:23.4110428Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.4110920Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.4110978Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.4111124Z [rank1]:E1204 11:58:46.353000 213861 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4111292Z [rank1]:E1204 11:58:46.353000 213861 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4111587Z [rank1]:E1204 11:58:46.353000 213861 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4111747Z [rank1]:E1204 11:58:46.353000 213861 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4112038Z [rank1]:E1204 11:58:46.353000 213861 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4112193Z [rank1]:E1204 11:58:46.353000 213861 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4112475Z [rank1]:E1204 11:58:46.353000 213861 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4112624Z [rank1]:E1204 11:58:46.353000 213861 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4112909Z [rank1]:E1204 11:58:46.353000 213861 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4113058Z [rank1]:E1204 11:58:46.353000 213861 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4113339Z [rank1]:E1204 11:58:46.353000 213861 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4113476Z [rank1]:E1204 11:58:46.353000 213861 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.4113758Z [rank1]:E1204 11:58:46.353000 213861 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4113935Z [rank1]:E1204 11:58:46.353000 213861 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4114403Z [rank1]:E1204 11:58:46.353000 213861 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:59:23.4114520Z [rank1]:E1204 11:58:46.353000 213861 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4114721Z [rank1]:E1204 11:58:46.353000 213861 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4115072Z [rank1]:E1204 11:58:46.353000 213861 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:59:23.4115188Z [rank1]:E1204 11:58:46.353000 213861 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4115402Z [rank1]:E1204 11:58:46.353000 213861 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4115569Z [rank1]:E1204 11:58:46.353000 213861 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:59:23.4115606Z dist init r=1, world=4 2025-12-04T11:59:23.4115744Z [rank3]:E1204 11:58:46.361000 213863 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4115904Z [rank3]:E1204 11:58:46.361000 213863 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4116196Z [rank3]:E1204 11:58:46.361000 213863 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4116351Z [rank3]:E1204 11:58:46.361000 213863 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4116654Z [rank3]:E1204 11:58:46.361000 213863 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4116779Z [rank3]:E1204 11:58:46.361000 213863 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4117061Z [rank3]:E1204 11:58:46.361000 213863 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4117211Z [rank3]:E1204 11:58:46.361000 213863 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4117493Z [rank3]:E1204 11:58:46.361000 213863 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4117644Z [rank3]:E1204 11:58:46.361000 213863 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4117926Z [rank3]:E1204 11:58:46.361000 213863 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4118062Z [rank3]:E1204 11:58:46.361000 213863 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.4118363Z [rank3]:E1204 11:58:46.361000 213863 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4118511Z [rank3]:E1204 11:58:46.361000 213863 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4118980Z [rank3]:E1204 11:58:46.361000 213863 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3456106496. 2025-12-04T11:59:23.4119094Z [rank3]:E1204 11:58:46.361000 213863 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4119294Z [rank3]:E1204 11:58:46.361000 213863 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4119682Z [rank3]:E1204 11:58:46.361000 213863 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:59:23.4119795Z [rank3]:E1204 11:58:46.361000 213863 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4120009Z [rank3]:E1204 11:58:46.361000 213863 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4120174Z [rank3]:E1204 11:58:46.361000 213863 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:59:23.4120213Z dist init r=3, world=4 2025-12-04T11:59:23.4120351Z [rank2]:E1204 11:58:46.406000 213862 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4120513Z [rank2]:E1204 11:58:46.406000 213862 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4120826Z [rank2]:E1204 11:58:46.406000 213862 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4120979Z [rank2]:E1204 11:58:46.406000 213862 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4121266Z [rank2]:E1204 11:58:46.406000 213862 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4121390Z [rank2]:E1204 11:58:46.406000 213862 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4121677Z [rank2]:E1204 11:58:46.406000 213862 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4121828Z [rank2]:E1204 11:58:46.406000 213862 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4122106Z [rank2]:E1204 11:58:46.406000 213862 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4122252Z [rank2]:E1204 11:58:46.406000 213862 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4122561Z [rank2]:E1204 11:58:46.406000 213862 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4122698Z [rank2]:E1204 11:58:46.406000 213862 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.4122979Z [rank2]:E1204 11:58:46.406000 213862 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4123128Z [rank2]:E1204 11:58:46.406000 213862 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4123594Z [rank2]:E1204 11:58:46.406000 213862 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 
2025-12-04T11:59:23.4123711Z [rank2]:E1204 11:58:46.406000 213862 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4123907Z [rank2]:E1204 11:58:46.406000 213862 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4124262Z [rank2]:E1204 11:58:46.406000 213862 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:59:23.4124375Z [rank2]:E1204 11:58:46.406000 213862 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4124587Z [rank2]:E1204 11:58:46.406000 213862 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4124753Z [rank2]:E1204 11:58:46.406000 213862 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:59:23.4124790Z dist init r=2, world=4 2025-12-04T11:59:23.4124927Z [rank0]:E1204 11:58:46.409000 213860 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4125105Z [rank0]:E1204 11:58:46.409000 213860 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4125397Z [rank0]:E1204 11:58:46.409000 213860 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4125550Z [rank0]:E1204 11:58:46.409000 213860 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4125839Z [rank0]:E1204 11:58:46.409000 213860 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4125962Z [rank0]:E1204 11:58:46.409000 213860 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4126244Z [rank0]:E1204 11:58:46.409000 213860 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4126394Z [rank0]:E1204 11:58:46.409000 213860 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4126671Z [rank0]:E1204 11:58:46.409000 213860 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4126838Z [rank0]:E1204 11:58:46.409000 213860 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4127118Z [rank0]:E1204 11:58:46.409000 213860 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4127256Z [rank0]:E1204 11:58:46.409000 213860 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:59:23.4127536Z [rank0]:E1204 11:58:46.409000 213860 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4127684Z [rank0]:E1204 11:58:46.409000 213860 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4128153Z [rank0]:E1204 11:58:46.409000 213860 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 0. CUDA driver allocated memory was 2459959296 and is now 3665821696. 2025-12-04T11:59:23.4128267Z [rank0]:E1204 11:58:46.409000 213860 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4128466Z [rank0]:E1204 11:58:46.409000 213860 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4128817Z [rank0]:E1204 11:58:46.409000 213860 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:59:23.4128932Z [rank0]:E1204 11:58:46.409000 213860 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4129146Z [rank0]:E1204 11:58:46.409000 213860 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4129327Z [rank0]:E1204 11:58:46.409000 213860 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:59:23.4129366Z dist init r=0, world=4 2025-12-04T11:59:23.4129404Z FAILED [8.3182s] [100%] 2025-12-04T11:59:23.4129406Z 2025-12-04T11:59:23.4129461Z =================================== FAILURES =================================== 2025-12-04T11:59:23.4129554Z ________ TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda _________ 2025-12-04T11:59:23.4129643Z Traceback (most recent call last): 2025-12-04T11:59:23.4129809Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:59:23.4129853Z self._join_processes(fn) 2025-12-04T11:59:23.4130027Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:59:23.4130082Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:59:23.4130263Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:59:23.4130305Z raise RuntimeError(error) 2025-12-04T11:59:23.4130384Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:59:23.4130429Z Traceback (most recent call last): 2025-12-04T11:59:23.4130589Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4130656Z getattr(self, test_name)() 2025-12-04T11:59:23.4130816Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 772, in wrapper 2025-12-04T11:59:23.4130852Z fn() 2025-12-04T11:59:23.4131003Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4131043Z method(*args, **kwargs) 2025-12-04T11:59:23.4131196Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4131236Z method(*args, **kwargs) 2025-12-04T11:59:23.4131391Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4131426Z with policy(): 2025-12-04T11:59:23.4131583Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4131624Z raise RuntimeError(msg) 2025-12-04T11:59:23.4131967Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:59:23.4131969Z 2025-12-04T11:59:23.4132043Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4132270Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:59:23.4132272Z 2025-12-04T11:59:23.4132358Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4132360Z 2025-12-04T11:59:23.4132419Z Process 3 exited with error code 10 and exception: 2025-12-04T11:59:23.4132462Z Traceback (most recent call last): 2025-12-04T11:59:23.4132628Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4132669Z getattr(self, test_name)() 2025-12-04T11:59:23.4132831Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4132866Z fn() 2025-12-04T11:59:23.4133044Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4133164Z method(*args, **kwargs) 2025-12-04T11:59:23.4133320Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4133360Z method(*args, **kwargs) 2025-12-04T11:59:23.4133513Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4133549Z with policy(): 2025-12-04T11:59:23.4133705Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4133746Z raise RuntimeError(msg) 2025-12-04T11:59:23.4134083Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3456106496. 
2025-12-04T11:59:23.4134087Z 2025-12-04T11:59:23.4134162Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4134385Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:59:23.4134387Z 2025-12-04T11:59:23.4134474Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4134476Z 2025-12-04T11:59:23.4134502Z 2025-12-04T11:59:23.4134576Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:59:23.4134662Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:59:23.4134913Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-3760c1e6f8841a9d.xml - 2025-12-04T11:59:23.4134973Z =========================== short test summary info ============================ 2025-12-04T11:59:23.4135217Z FAILED [8.3182s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:59:23.4135261Z Traceback (most recent call last): 2025-12-04T11:59:23.4135427Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4135468Z getattr(self, test_name)() 2025-12-04T11:59:23.4135636Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4135669Z fn() 2025-12-04T11:59:23.4135824Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4135862Z method(*args, **kwargs) 2025-12-04T11:59:23.4136016Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4136054Z method(*args, **kwargs) 2025-12-04T11:59:23.4136206Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4136241Z with policy(): 2025-12-04T11:59:23.4136393Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4136433Z raise RuntimeError(msg) 2025-12-04T11:59:23.4136774Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 
2025-12-04T11:59:23.4136776Z 2025-12-04T11:59:23.4136849Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4137091Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:59:23.4137093Z 2025-12-04T11:59:23.4137180Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4137182Z 2025-12-04T11:59:23.4137239Z Process 3 exited with error code 10 and exception: 2025-12-04T11:59:23.4137284Z Traceback (most recent call last): 2025-12-04T11:59:23.4137446Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4137489Z getattr(self, test_name)() 2025-12-04T11:59:23.4137650Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4137683Z fn() 2025-12-04T11:59:23.4137837Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4137877Z method(*args, **kwargs) 2025-12-04T11:59:23.4138030Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4138068Z method(*args, **kwargs) 2025-12-04T11:59:23.4138223Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4140096Z with policy(): 2025-12-04T11:59:23.4140262Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4140356Z raise RuntimeError(msg) 2025-12-04T11:59:23.4140708Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3456106496. 2025-12-04T11:59:23.4140710Z 2025-12-04T11:59:23.4140787Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4141014Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:59:23.4141016Z 2025-12-04T11:59:23.4141102Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4141166Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T11:59:23.4141232Z ======================= 1 failed, 7 deselected in 8.33s ======================== 2025-12-04T11:59:23.4141270Z Got exit code 1 2025-12-04T11:59:23.4141445Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda 2025-12-04T11:59:23.4141574Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T11:59:23.4141784Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-e1a1f11b70cb3331.xml 2025-12-04T11:59:23.4141844Z ============================= test session starts ============================== 2025-12-04T11:59:23.4141959Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:59:23.4141999Z cachedir: .pytest_cache 2025-12-04T11:59:23.4142160Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:59:23.4142209Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:59:23.4142249Z configfile: pytest.ini 2025-12-04T11:59:23.4142415Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:59:23.4142486Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:59:23.4142575Z stepcurrent: skipping 7 already run items. 2025-12-04T11:59:23.4142619Z Running 1 items in this shard 2025-12-04T11:59:23.4142621Z 2025-12-04T11:59:23.4142927Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda I1204 11:58:50.031000 214185 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 214254 2025-12-04T11:59:23.4143084Z I1204 11:58:50.031000 214185 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 214255 2025-12-04T11:59:23.4143243Z I1204 11:58:50.032000 214185 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 214256 2025-12-04T11:59:23.4143395Z I1204 11:58:50.032000 214185 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 214257 2025-12-04T11:59:23.4143905Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.4143966Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.4144467Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:59:23.4144552Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.4145050Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.4145108Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.4145601Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.4145660Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.4145806Z [rank1]:E1204 11:58:57.230000 214255 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4145973Z [rank1]:E1204 11:58:57.230000 214255 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4146270Z [rank1]:E1204 11:58:57.230000 214255 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4146429Z [rank1]:E1204 11:58:57.230000 214255 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4146723Z [rank1]:E1204 11:58:57.230000 214255 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4146851Z [rank1]:E1204 11:58:57.230000 214255 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4147151Z [rank1]:E1204 11:58:57.230000 214255 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4147302Z [rank1]:E1204 11:58:57.230000 214255 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4147583Z [rank1]:E1204 11:58:57.230000 214255 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4147733Z [rank1]:E1204 11:58:57.230000 214255 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4148012Z [rank1]:E1204 11:58:57.230000 214255 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4148149Z [rank1]:E1204 11:58:57.230000 214255 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.4148434Z [rank1]:E1204 11:58:57.230000 214255 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4148583Z [rank1]:E1204 11:58:57.230000 214255 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4149079Z [rank1]:E1204 11:58:57.230000 214255 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:59:23.4149199Z [rank1]:E1204 11:58:57.230000 214255 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4149398Z [rank1]:E1204 11:58:57.230000 214255 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4149808Z [rank1]:E1204 11:58:57.230000 214255 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:59:23.4149924Z [rank1]:E1204 11:58:57.230000 214255 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4150140Z [rank1]:E1204 11:58:57.230000 214255 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4150307Z [rank1]:E1204 11:58:57.230000 214255 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:59:23.4150347Z dist init r=1, world=4 2025-12-04T11:59:23.4150485Z [rank2]:E1204 11:58:57.237000 214256 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4150648Z [rank2]:E1204 11:58:57.237000 214256 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4150939Z [rank2]:E1204 11:58:57.237000 214256 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4151096Z [rank2]:E1204 11:58:57.237000 214256 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4151417Z [rank2]:E1204 11:58:57.237000 214256 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4151543Z [rank2]:E1204 11:58:57.237000 214256 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4151823Z [rank2]:E1204 11:58:57.237000 214256 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4151974Z [rank2]:E1204 11:58:57.237000 214256 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4152253Z [rank2]:E1204 11:58:57.237000 214256 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4152404Z [rank2]:E1204 11:58:57.237000 214256 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4152683Z [rank2]:E1204 11:58:57.237000 214256 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4152821Z [rank2]:E1204 11:58:57.237000 214256 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.4153101Z [rank2]:E1204 11:58:57.237000 214256 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4153279Z [rank2]:E1204 11:58:57.237000 214256 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4153749Z [rank2]:E1204 11:58:57.237000 214256 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 2025-12-04T11:59:23.4153866Z [rank2]:E1204 11:58:57.237000 214256 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4154065Z [rank2]:E1204 11:58:57.237000 214256 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4154421Z [rank2]:E1204 11:58:57.237000 214256 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:59:23.4154535Z [rank2]:E1204 11:58:57.237000 214256 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4154749Z [rank2]:E1204 11:58:57.237000 214256 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4154917Z [rank2]:E1204 11:58:57.237000 214256 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:59:23.4154954Z dist init r=2, world=4 2025-12-04T11:59:23.4155094Z [rank0]:E1204 11:58:57.238000 214254 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4155256Z [rank0]:E1204 11:58:57.238000 214254 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4155544Z [rank0]:E1204 11:58:57.238000 214254 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4155728Z [rank0]:E1204 11:58:57.238000 214254 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4156015Z [rank0]:E1204 11:58:57.238000 214254 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4156139Z [rank0]:E1204 11:58:57.238000 214254 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4156426Z [rank0]:E1204 11:58:57.238000 214254 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4156576Z [rank0]:E1204 11:58:57.238000 214254 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4156857Z [rank0]:E1204 11:58:57.238000 214254 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4157006Z [rank0]:E1204 11:58:57.238000 214254 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4157287Z [rank0]:E1204 11:58:57.238000 214254 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4159062Z [rank0]:E1204 11:58:57.238000 214254 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.4159344Z [rank0]:E1204 11:58:57.238000 214254 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4159496Z [rank0]:E1204 11:58:57.238000 214254 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4160010Z [rank0]:E1204 11:58:57.238000 214254 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 0. CUDA driver allocated memory was 2459959296 and is now 3665821696. 
2025-12-04T11:59:23.4160126Z [rank0]:E1204 11:58:57.238000 214254 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4160327Z [rank0]:E1204 11:58:57.238000 214254 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4160685Z [rank0]:E1204 11:58:57.238000 214254 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:59:23.4160797Z [rank0]:E1204 11:58:57.238000 214254 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4161013Z [rank0]:E1204 11:58:57.238000 214254 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4161179Z [rank0]:E1204 11:58:57.238000 214254 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:59:23.4161217Z dist init r=0, world=4 2025-12-04T11:59:23.4161354Z [rank3]:E1204 11:58:57.293000 214257 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4161516Z [rank3]:E1204 11:58:57.293000 214257 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4161836Z [rank3]:E1204 11:58:57.293000 214257 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4161994Z [rank3]:E1204 11:58:57.293000 214257 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4162282Z [rank3]:E1204 11:58:57.293000 214257 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4162406Z [rank3]:E1204 11:58:57.293000 214257 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4162691Z [rank3]:E1204 11:58:57.293000 214257 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4162838Z [rank3]:E1204 11:58:57.293000 214257 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4163119Z [rank3]:E1204 11:58:57.293000 214257 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4163266Z [rank3]:E1204 11:58:57.293000 214257 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4163573Z [rank3]:E1204 11:58:57.293000 214257 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4163709Z [rank3]:E1204 11:58:57.293000 214257 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:59:23.4163991Z [rank3]:E1204 11:58:57.293000 214257 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4164140Z [rank3]:E1204 11:58:57.293000 214257 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4164612Z [rank3]:E1204 11:58:57.293000 214257 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 3. CUDA driver allocated memory was 2250244096 and is now 3456106496. 2025-12-04T11:59:23.4164727Z [rank3]:E1204 11:58:57.293000 214257 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4164926Z [rank3]:E1204 11:58:57.293000 214257 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4165280Z [rank3]:E1204 11:58:57.293000 214257 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:59:23.4165392Z [rank3]:E1204 11:58:57.293000 214257 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4165606Z [rank3]:E1204 11:58:57.293000 214257 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4165773Z [rank3]:E1204 11:58:57.293000 214257 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:59:23.4165809Z dist init r=3, world=4 2025-12-04T11:59:23.4165866Z FAILED [8.3175s] [100%] 2025-12-04T11:59:23.4165868Z 2025-12-04T11:59:23.4165924Z =================================== FAILURES =================================== 2025-12-04T11:59:23.4166023Z ________ TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda _________ 2025-12-04T11:59:23.4166068Z Traceback (most recent call last): 2025-12-04T11:59:23.4166233Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:59:23.4166277Z self._join_processes(fn) 2025-12-04T11:59:23.4166453Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:59:23.4166505Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:59:23.4166686Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:59:23.4166732Z raise RuntimeError(error) 2025-12-04T11:59:23.4166811Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:59:23.4166855Z Traceback (most recent call last): 2025-12-04T11:59:23.4167017Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4167060Z getattr(self, test_name)() 2025-12-04T11:59:23.4167220Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 772, in wrapper 2025-12-04T11:59:23.4167275Z fn() 2025-12-04T11:59:23.4167426Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4167467Z method(*args, **kwargs) 2025-12-04T11:59:23.4167620Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4167660Z method(*args, **kwargs) 2025-12-04T11:59:23.4167813Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4167850Z with policy(): 2025-12-04T11:59:23.4168003Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4168044Z raise RuntimeError(msg) 2025-12-04T11:59:23.4168384Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:59:23.4168388Z 2025-12-04T11:59:23.4168463Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4168688Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:59:23.4168691Z 2025-12-04T11:59:23.4168778Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4168780Z 2025-12-04T11:59:23.4168782Z 2025-12-04T11:59:23.4168857Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:59:23.4168944Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:59:23.4169202Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-e1a1f11b70cb3331.xml - 2025-12-04T11:59:23.4169263Z =========================== short test summary info ============================ 2025-12-04T11:59:23.4169510Z FAILED [8.3175s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:59:23.4169554Z Traceback (most recent call last): 2025-12-04T11:59:23.4169798Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4169839Z getattr(self, test_name)() 2025-12-04T11:59:23.4169999Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4170032Z fn() 2025-12-04T11:59:23.4170186Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4170226Z method(*args, **kwargs) 2025-12-04T11:59:23.4170380Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4170418Z method(*args, **kwargs) 2025-12-04T11:59:23.4170570Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4170607Z with policy(): 2025-12-04T11:59:23.4170761Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4170803Z raise RuntimeError(msg) 2025-12-04T11:59:23.4171142Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:59:23.4171170Z 2025-12-04T11:59:23.4171245Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4171470Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:59:23.4171473Z 2025-12-04T11:59:23.4171560Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4171624Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:59:23.4171687Z ======================= 1 failed, 7 deselected in 8.33s ======================== 2025-12-04T11:59:23.4171723Z Got exit code 1 2025-12-04T11:59:23.4171763Z Retrying single test... 
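Note on the failure mode above: this shard runs the mem_leak_check configuration, so PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 wraps each test in a check that snapshots per-device caching-allocator and driver-reported memory before the test and compares afterwards; any growth is raised as the "CUDA driver API confirmed a leak" RuntimeError seen in every rank's traceback. The repro command printed with each failure re-runs the single test under the same environment variables, and PYTORCH_PRINT_REPRO_ON_FAILURE=0 only silences that hint. The sketch below is a simplified, self-contained illustration of that kind of before/after check, not the implementation in torch/testing/_internal/common_utils.py; the name cuda_leak_check and the threshold-free comparison are assumptions made for the example.

# Simplified illustration of a per-device CUDA/ROCm memory leak check
# (assumed names; not the check used by common_utils.py).
import contextlib

import torch

@contextlib.contextmanager
def cuda_leak_check(device: int = 0):
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)      # caching-allocator bytes in use
    free_before, total = torch.cuda.mem_get_info(device)    # driver-level free/total bytes
    driver_before = total - free_before
    try:
        yield
    finally:
        torch.cuda.synchronize(device)
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        driver_after = total - free_after
        if alloc_after > alloc_before:
            raise RuntimeError(
                f"possible leak on device {device}: caching allocator went from "
                f"{alloc_before} to {alloc_after} bytes; driver-allocated memory went from "
                f"{driver_before} to {driver_after} bytes"
            )

Used as `with cuda_leak_check(torch.cuda.current_device()): ...` around a test body, this reproduces the shape of the numbers quoted in the errors above (512 -> 3584/4096 bytes on the caching allocator, roughly 2.3 GB -> 3.5 GB at the driver level).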
2025-12-04T11:59:23.4171977Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-8927b7ab7fcfe23b.xml 2025-12-04T11:59:23.4172034Z ============================= test session starts ============================== 2025-12-04T11:59:23.4172148Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:59:23.4172189Z cachedir: .pytest_cache 2025-12-04T11:59:23.4172350Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:59:23.4172394Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:59:23.4172434Z configfile: pytest.ini 2025-12-04T11:59:23.4172600Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:59:23.4172673Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:59:23.4172893Z stepcurrent: skipping 7 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda 2025-12-04T11:59:23.4172939Z Running 1 items in this shard 2025-12-04T11:59:23.4172942Z 2025-12-04T11:59:23.4173247Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda I1204 11:59:00.957000 214579 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 214648 2025-12-04T11:59:23.4173403Z I1204 11:59:00.958000 214579 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 214649 2025-12-04T11:59:23.4173586Z I1204 11:59:00.958000 214579 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 214650 2025-12-04T11:59:23.4173740Z I1204 11:59:00.959000 214579 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 214651 2025-12-04T11:59:23.4174256Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.4174318Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.4174817Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.4174876Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.4175370Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:59:23.4175446Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.4175938Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.4175995Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.4176140Z [rank3]:E1204 11:59:08.072000 214651 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4176307Z [rank3]:E1204 11:59:08.072000 214651 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4176607Z [rank3]:E1204 11:59:08.072000 214651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4176766Z [rank3]:E1204 11:59:08.072000 214651 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4177058Z [rank3]:E1204 11:59:08.072000 214651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4177185Z [rank3]:E1204 11:59:08.072000 214651 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4177465Z [rank3]:E1204 11:59:08.072000 214651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4177616Z [rank3]:E1204 11:59:08.072000 214651 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4177915Z [rank3]:E1204 11:59:08.072000 214651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4178063Z [rank3]:E1204 11:59:08.072000 214651 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4178342Z [rank3]:E1204 11:59:08.072000 214651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4178480Z [rank3]:E1204 11:59:08.072000 214651 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.4178762Z [rank3]:E1204 11:59:08.072000 214651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4178914Z [rank3]:E1204 11:59:08.072000 214651 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4179387Z [rank3]:E1204 11:59:08.072000 214651 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 3. CUDA driver allocated memory was 2250244096 and is now 3456106496. 2025-12-04T11:59:23.4179503Z [rank3]:E1204 11:59:08.072000 214651 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4179764Z [rank3]:E1204 11:59:08.072000 214651 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4180117Z [rank3]:E1204 11:59:08.072000 214651 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:59:23.4180234Z [rank3]:E1204 11:59:08.072000 214651 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4180447Z [rank3]:E1204 11:59:08.072000 214651 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4180615Z [rank3]:E1204 11:59:08.072000 214651 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:59:23.4180654Z dist init r=3, world=4 2025-12-04T11:59:23.4180794Z [rank1]:E1204 11:59:08.076000 214649 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4180955Z [rank1]:E1204 11:59:08.076000 214649 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4181246Z [rank1]:E1204 11:59:08.076000 214649 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4181403Z [rank1]:E1204 11:59:08.076000 214649 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4181692Z [rank1]:E1204 11:59:08.076000 214649 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4181819Z [rank1]:E1204 11:59:08.076000 214649 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4182102Z [rank1]:E1204 11:59:08.076000 214649 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4182279Z [rank1]:E1204 11:59:08.076000 214649 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4182558Z [rank1]:E1204 11:59:08.076000 214649 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4182706Z [rank1]:E1204 11:59:08.076000 214649 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4182988Z [rank1]:E1204 11:59:08.076000 214649 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4183125Z [rank1]:E1204 11:59:08.076000 214649 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.4183409Z [rank1]:E1204 11:59:08.076000 214649 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4183560Z [rank1]:E1204 11:59:08.076000 214649 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4184035Z [rank1]:E1204 11:59:08.076000 214649 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:59:23.4184180Z [rank1]:E1204 11:59:08.076000 214649 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4184378Z [rank1]:E1204 11:59:08.076000 214649 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4184732Z [rank1]:E1204 11:59:08.076000 214649 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:59:23.4184844Z [rank1]:E1204 11:59:08.076000 214649 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4185058Z [rank1]:E1204 11:59:08.076000 214649 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4185225Z [rank1]:E1204 11:59:08.076000 214649 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:59:23.4185264Z dist init r=1, world=4 2025-12-04T11:59:23.4185402Z [rank0]:E1204 11:59:08.090000 214648 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4185564Z [rank0]:E1204 11:59:08.090000 214648 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4185853Z [rank0]:E1204 11:59:08.090000 214648 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4186007Z [rank0]:E1204 11:59:08.090000 214648 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4186301Z [rank0]:E1204 11:59:08.090000 214648 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4186424Z [rank0]:E1204 11:59:08.090000 214648 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4186722Z [rank0]:E1204 11:59:08.090000 214648 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4186871Z [rank0]:E1204 11:59:08.090000 214648 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4187149Z [rank0]:E1204 11:59:08.090000 214648 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4187298Z [rank0]:E1204 11:59:08.090000 214648 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4187578Z [rank0]:E1204 11:59:08.090000 214648 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4187717Z [rank0]:E1204 11:59:08.090000 214648 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.4187996Z [rank0]:E1204 11:59:08.090000 214648 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4188145Z [rank0]:E1204 11:59:08.090000 214648 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4188645Z [rank0]:E1204 11:59:08.090000 214648 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 0. CUDA driver allocated memory was 2459959296 and is now 3665821696. 
2025-12-04T11:59:23.4188763Z [rank0]:E1204 11:59:08.090000 214648 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4188960Z [rank0]:E1204 11:59:08.090000 214648 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4189312Z [rank0]:E1204 11:59:08.090000 214648 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:59:23.4189428Z [rank0]:E1204 11:59:08.090000 214648 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4189674Z [rank0]:E1204 11:59:08.090000 214648 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4189841Z [rank0]:E1204 11:59:08.090000 214648 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:59:23.4189879Z dist init r=0, world=4 2025-12-04T11:59:23.4190019Z [rank2]:E1204 11:59:08.092000 214650 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4190181Z [rank2]:E1204 11:59:08.092000 214650 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4190473Z [rank2]:E1204 11:59:08.092000 214650 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4190629Z [rank2]:E1204 11:59:08.092000 214650 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4190949Z [rank2]:E1204 11:59:08.092000 214650 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4191074Z [rank2]:E1204 11:59:08.092000 214650 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4191352Z [rank2]:E1204 11:59:08.092000 214650 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4191503Z [rank2]:E1204 11:59:08.092000 214650 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4191783Z [rank2]:E1204 11:59:08.092000 214650 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4191933Z [rank2]:E1204 11:59:08.092000 214650 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4192211Z [rank2]:E1204 11:59:08.092000 214650 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4192348Z [rank2]:E1204 11:59:08.092000 214650 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:59:23.4192629Z [rank2]:E1204 11:59:08.092000 214650 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4192803Z [rank2]:E1204 11:59:08.092000 214650 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4193275Z [rank2]:E1204 11:59:08.092000 214650 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 2025-12-04T11:59:23.4193389Z [rank2]:E1204 11:59:08.092000 214650 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4193587Z [rank2]:E1204 11:59:08.092000 214650 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4193942Z [rank2]:E1204 11:59:08.092000 214650 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:59:23.4194054Z [rank2]:E1204 11:59:08.092000 214650 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4194272Z [rank2]:E1204 11:59:08.092000 214650 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4194437Z [rank2]:E1204 11:59:08.092000 214650 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:59:23.4194476Z dist init r=2, world=4 2025-12-04T11:59:23.4194513Z FAILED [8.3183s] [100%] 2025-12-04T11:59:23.4194517Z 2025-12-04T11:59:23.4194572Z =================================== FAILURES =================================== 2025-12-04T11:59:23.4194668Z ________ TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda _________ 2025-12-04T11:59:23.4194713Z Traceback (most recent call last): 2025-12-04T11:59:23.4194875Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:59:23.4194919Z self._join_processes(fn) 2025-12-04T11:59:23.4195112Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:59:23.4195167Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:59:23.4195347Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:59:23.4195390Z raise RuntimeError(error) 2025-12-04T11:59:23.4195469Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:59:23.4195516Z Traceback (most recent call last): 2025-12-04T11:59:23.4195678Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4195721Z getattr(self, test_name)() 2025-12-04T11:59:23.4195881Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 772, in wrapper 2025-12-04T11:59:23.4195915Z fn() 2025-12-04T11:59:23.4196069Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4196108Z method(*args, **kwargs) 2025-12-04T11:59:23.4196262Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4196300Z method(*args, **kwargs) 2025-12-04T11:59:23.4196452Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4196512Z with policy(): 2025-12-04T11:59:23.4196666Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4196705Z raise RuntimeError(msg) 2025-12-04T11:59:23.4197047Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:59:23.4197049Z 2025-12-04T11:59:23.4197124Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4197346Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:59:23.4197348Z 2025-12-04T11:59:23.4197437Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4197440Z 2025-12-04T11:59:23.4197442Z 2025-12-04T11:59:23.4197515Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:59:23.4197603Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:59:23.4197857Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-8927b7ab7fcfe23b.xml - 2025-12-04T11:59:23.4197917Z =========================== short test summary info ============================ 2025-12-04T11:59:23.4198159Z FAILED [8.3183s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:59:23.4198205Z Traceback (most recent call last): 2025-12-04T11:59:23.4198369Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4198412Z getattr(self, test_name)() 2025-12-04T11:59:23.4198572Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4198606Z fn() 2025-12-04T11:59:23.4198759Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4198821Z method(*args, **kwargs) 2025-12-04T11:59:23.4198977Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4199016Z method(*args, **kwargs) 2025-12-04T11:59:23.4199167Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4199204Z with policy(): 2025-12-04T11:59:23.4199357Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4199399Z raise RuntimeError(msg) 2025-12-04T11:59:23.4199785Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:59:23.4199788Z 2025-12-04T11:59:23.4199863Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4200086Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:59:23.4200088Z 2025-12-04T11:59:23.4200175Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4200238Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:59:23.4200326Z ======================= 1 failed, 7 deselected in 8.33s ======================== 2025-12-04T11:59:23.4200362Z Got exit code 1 2025-12-04T11:59:23.4200400Z Retrying single test... 
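Note on the repeated UserWarning from torch/distributed/fsdp/_init_utils.py: each rank passed device_id as the bare string "cuda" with no index, so FSDP falls back to that rank's current device. The warning itself suggests either calling torch.cuda.set_device() before wrapping or passing an explicit device index. The snippet below is an illustrative per-rank setup along those lines, assuming a torchrun-style launcher that exports LOCAL_RANK and a process group that is already initialized; wrap_with_fsdp is a made-up helper, not code from the test file.

# Illustrative fix for the `device_id` warning: select this rank's GPU explicitly
# and hand FSDP an integer index instead of the bare "cuda" string.
# Assumes LOCAL_RANK is set by the launcher and init_process_group() already ran.
import os

import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_with_fsdp(model: torch.nn.Module) -> FSDP:
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    torch.cuda.set_device(local_rank)           # make this rank's GPU the current device
    return FSDP(model, device_id=local_rank)    # explicit index removes the ambiguity

Passing device_id=torch.device("cuda", local_rank) is an equivalent way to supply the explicit index the warning asks for.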
2025-12-04T11:59:23.4200608Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-e8a73bea9050c490.xml 2025-12-04T11:59:23.4200664Z ============================= test session starts ============================== 2025-12-04T11:59:23.4200779Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:59:23.4200819Z cachedir: .pytest_cache 2025-12-04T11:59:23.4200979Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:59:23.4201023Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:59:23.4201064Z configfile: pytest.ini 2025-12-04T11:59:23.4201227Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:59:23.4201300Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:59:23.4201518Z stepcurrent: skipping 7 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda 2025-12-04T11:59:23.4201560Z Running 1 items in this shard 2025-12-04T11:59:23.4201562Z 2025-12-04T11:59:23.4201868Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda I1204 11:59:11.821000 214973 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 215042 2025-12-04T11:59:23.4202025Z I1204 11:59:11.822000 214973 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 215043 2025-12-04T11:59:23.4202180Z I1204 11:59:11.822000 214973 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 215044 2025-12-04T11:59:23.4202333Z I1204 11:59:11.823000 214973 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 215045 2025-12-04T11:59:23.4202866Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.4202927Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.4203425Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.4203486Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.4203981Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:59:23.4204039Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.4204532Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:59:23.4204619Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:59:23.4204763Z [rank0]:E1204 11:59:18.963000 215042 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4204927Z [rank0]:E1204 11:59:18.963000 215042 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4205221Z [rank0]:E1204 11:59:18.963000 215042 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4205375Z [rank0]:E1204 11:59:18.963000 215042 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4205664Z [rank0]:E1204 11:59:18.963000 215042 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4205791Z [rank0]:E1204 11:59:18.963000 215042 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4206077Z [rank0]:E1204 11:59:18.963000 215042 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4206227Z [rank0]:E1204 11:59:18.963000 215042 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4206507Z [rank0]:E1204 11:59:18.963000 215042 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4206656Z [rank0]:E1204 11:59:18.963000 215042 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4206934Z [rank0]:E1204 11:59:18.963000 215042 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4207091Z [rank0]:E1204 11:59:18.963000 215042 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.4207372Z [rank0]:E1204 11:59:18.963000 215042 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4207521Z [rank0]:E1204 11:59:18.963000 215042 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4207993Z [rank0]:E1204 11:59:18.963000 215042 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 0. CUDA driver allocated memory was 2459959296 and is now 3665821696. 2025-12-04T11:59:23.4208110Z [rank0]:E1204 11:59:18.963000 215042 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4208309Z [rank0]:E1204 11:59:18.963000 215042 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4208662Z [rank0]:E1204 11:59:18.963000 215042 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:59:23.4208775Z [rank0]:E1204 11:59:18.963000 215042 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4209008Z [rank0]:E1204 11:59:18.963000 215042 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4209174Z [rank0]:E1204 11:59:18.963000 215042 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:59:23.4209211Z dist init r=0, world=4 2025-12-04T11:59:23.4209351Z [rank2]:E1204 11:59:18.971000 215044 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4209513Z [rank2]:E1204 11:59:18.971000 215044 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4209842Z [rank2]:E1204 11:59:18.971000 215044 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4209999Z [rank2]:E1204 11:59:18.971000 215044 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4210286Z [rank2]:E1204 11:59:18.971000 215044 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4210413Z [rank2]:E1204 11:59:18.971000 215044 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4210693Z [rank2]:E1204 11:59:18.971000 215044 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4210842Z [rank2]:E1204 11:59:18.971000 215044 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4211123Z [rank2]:E1204 11:59:18.971000 215044 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4211270Z [rank2]:E1204 11:59:18.971000 215044 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4211576Z [rank2]:E1204 11:59:18.971000 215044 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4211713Z [rank2]:E1204 11:59:18.971000 215044 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.4211997Z [rank2]:E1204 11:59:18.971000 215044 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4212149Z [rank2]:E1204 11:59:18.971000 215044 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4212624Z [rank2]:E1204 11:59:18.971000 215044 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 2025-12-04T11:59:23.4212739Z [rank2]:E1204 11:59:18.971000 215044 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4212935Z [rank2]:E1204 11:59:18.971000 215044 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4213289Z [rank2]:E1204 11:59:18.971000 215044 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:59:23.4213429Z [rank2]:E1204 11:59:18.971000 215044 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4213644Z [rank2]:E1204 11:59:18.971000 215044 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4213809Z [rank2]:E1204 11:59:18.971000 215044 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:59:23.4213847Z dist init r=2, world=4 2025-12-04T11:59:23.4213985Z [rank3]:E1204 11:59:18.996000 215045 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4214147Z [rank3]:E1204 11:59:18.996000 215045 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4214441Z [rank3]:E1204 11:59:18.996000 215045 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4214594Z [rank3]:E1204 11:59:18.996000 215045 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4214884Z [rank3]:E1204 11:59:18.996000 215045 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4215006Z [rank3]:E1204 11:59:18.996000 215045 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4215287Z [rank3]:E1204 11:59:18.996000 215045 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4215436Z [rank3]:E1204 11:59:18.996000 215045 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4215739Z [rank3]:E1204 11:59:18.996000 215045 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4215887Z [rank3]:E1204 11:59:18.996000 215045 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4216164Z [rank3]:E1204 11:59:18.996000 215045 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4216301Z [rank3]:E1204 11:59:18.996000 215045 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:59:23.4216586Z [rank3]:E1204 11:59:18.996000 215045 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4216735Z [rank3]:E1204 11:59:18.996000 215045 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4217204Z [rank3]:E1204 11:59:18.996000 215045 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 3. CUDA driver allocated memory was 2250244096 and is now 3456106496. 
2025-12-04T11:59:23.4217319Z [rank3]:E1204 11:59:18.996000 215045 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4217537Z [rank3]:E1204 11:59:18.996000 215045 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4217890Z [rank3]:E1204 11:59:18.996000 215045 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:59:23.4218003Z [rank3]:E1204 11:59:18.996000 215045 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4218217Z [rank3]:E1204 11:59:18.996000 215045 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4218383Z [rank3]:E1204 11:59:18.996000 215045 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:59:23.4218422Z dist init r=3, world=4 2025-12-04T11:59:23.4218560Z [rank1]:E1204 11:59:19.004000 215043 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:59:23.4218721Z [rank1]:E1204 11:59:19.004000 215043 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:59:23.4219017Z [rank1]:E1204 11:59:19.004000 215043 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4219173Z [rank1]:E1204 11:59:19.004000 215043 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:59:23.4219459Z [rank1]:E1204 11:59:19.004000 215043 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4219621Z [rank1]:E1204 11:59:19.004000 215043 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:59:23.4219899Z [rank1]:E1204 11:59:19.004000 215043 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4220082Z [rank1]:E1204 11:59:19.004000 215043 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4220359Z [rank1]:E1204 11:59:19.004000 215043 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4220507Z [rank1]:E1204 11:59:19.004000 215043 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:59:23.4220787Z [rank1]:E1204 11:59:19.004000 215043 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4220924Z [rank1]:E1204 11:59:19.004000 215043 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:59:23.4221206Z [rank1]:E1204 11:59:19.004000 215043 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4221355Z [rank1]:E1204 11:59:19.004000 215043 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:59:23.4221827Z [rank1]:E1204 11:59:19.004000 215043 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:59:23.4221963Z [rank1]:E1204 11:59:19.004000 215043 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4222162Z [rank1]:E1204 11:59:19.004000 215043 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4222514Z [rank1]:E1204 11:59:19.004000 215043 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:59:23.4222625Z [rank1]:E1204 11:59:19.004000 215043 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:59:23.4222839Z [rank1]:E1204 11:59:19.004000 215043 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4223005Z [rank1]:E1204 11:59:19.004000 215043 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:59:23.4223043Z dist init r=1, world=4 2025-12-04T11:59:23.4223079Z FAILED [8.4184s] [100%] 2025-12-04T11:59:23.4223081Z 2025-12-04T11:59:23.4223138Z =================================== FAILURES =================================== 2025-12-04T11:59:23.4223233Z ________ TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda _________ 2025-12-04T11:59:23.4223278Z Traceback (most recent call last): 2025-12-04T11:59:23.4223440Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:59:23.4223483Z self._join_processes(fn) 2025-12-04T11:59:23.4223657Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:59:23.4223713Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:59:23.4223893Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:59:23.4223936Z raise RuntimeError(error) 2025-12-04T11:59:23.4224014Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T11:59:23.4224077Z Traceback (most recent call last): 2025-12-04T11:59:23.4224239Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4224280Z getattr(self, test_name)() 2025-12-04T11:59:23.4224439Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 772, in wrapper 2025-12-04T11:59:23.4224472Z fn() 2025-12-04T11:59:23.4224624Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4224664Z method(*args, **kwargs) 2025-12-04T11:59:23.4224818Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4224856Z method(*args, **kwargs) 2025-12-04T11:59:23.4225010Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4225046Z with policy(): 2025-12-04T11:59:23.4225200Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4225239Z raise RuntimeError(msg) 2025-12-04T11:59:23.4225582Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 3. CUDA driver allocated memory was 2250244096 and is now 3456106496. 2025-12-04T11:59:23.4225605Z 2025-12-04T11:59:23.4225679Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4225903Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:59:23.4225905Z 2025-12-04T11:59:23.4225994Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4225996Z 2025-12-04T11:59:23.4225998Z 2025-12-04T11:59:23.4226072Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:59:23.4226159Z Process 3 terminated with exit code 10, terminating remaining processes. 
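The _join_processes / _check_return_codes frames in the traceback above show the multiprocess-harness pattern behind the "terminated with exit code 10, terminating remaining processes" message: the parent watches all spawned ranks and, as soon as one exits non-zero, tears down the rest and re-raises. The following is a simplified, self-contained illustration of that pattern, not the torch.testing._internal.common_distributed implementation; the worker function and the hard-coded exit code 10 are stand-ins.

import multiprocessing as mp
import sys
import time

def worker(rank: int) -> None:
    # stand-in for a per-rank test body; rank 3 mimics the leak-check
    # failure by exiting with the same error code seen in the log
    time.sleep(0.1)
    sys.exit(10 if rank == 3 else 0)

if __name__ == "__main__":
    procs = [mp.Process(target=worker, args=(r,)) for r in range(4)]
    for p in procs:
        p.start()
    while True:
        for rank, p in enumerate(procs):
            if p.exitcode not in (None, 0):
                # one rank failed: terminate the survivors and surface it
                for other in procs:
                    if other.is_alive():
                        other.terminate()
                raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")
        if all(p.exitcode == 0 for p in procs):
            break
        time.sleep(0.05)
    print("all ranks exited cleanly")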
2025-12-04T11:59:23.4226409Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-e8a73bea9050c490.xml - 2025-12-04T11:59:23.4226470Z =========================== short test summary info ============================ 2025-12-04T11:59:23.4226711Z FAILED [8.4184s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T11:59:23.4226755Z Traceback (most recent call last): 2025-12-04T11:59:23.4226922Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:59:23.4226964Z getattr(self, test_name)() 2025-12-04T11:59:23.4227124Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:59:23.4227158Z fn() 2025-12-04T11:59:23.4227310Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4227350Z method(*args, **kwargs) 2025-12-04T11:59:23.4227503Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:59:23.4227542Z method(*args, **kwargs) 2025-12-04T11:59:23.4227694Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:59:23.4227730Z with policy(): 2025-12-04T11:59:23.4227903Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:59:23.4227943Z raise RuntimeError(msg) 2025-12-04T11:59:23.4228289Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 3. CUDA driver allocated memory was 2250244096 and is now 3456106496. 2025-12-04T11:59:23.4228291Z 2025-12-04T11:59:23.4228364Z To execute this test, run the following from the base repo dir: 2025-12-04T11:59:23.4228588Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:59:23.4228590Z 2025-12-04T11:59:23.4228676Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:59:23.4228739Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
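Separately from the leak itself, the UserWarning from torch/distributed/fsdp/_init_utils.py printed earlier in this run spells out its own remedy: either make the rank's GPU the current device before constructing FSDP, or pass an indexed device rather than the bare "cuda" string as device_id. A minimal sketch of both options follows; the rendezvous settings, the setup_fsdp helper, and the Linear stand-in model are assumptions for illustration, not code from the failing test.

import os
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def setup_fsdp(rank: int, world_size: int) -> FSDP:
    # assumed single-node rendezvous config for the sketch
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("nccl", rank=rank, world_size=world_size)

    # option 1: set the current device for this rank before FSDP init,
    # so device resolution no longer has to guess an index
    torch.cuda.set_device(rank)

    model = torch.nn.Linear(8, 8)  # stand-in for the test's model
    # option 2: pass an explicit device index instead of the bare "cuda"
    return FSDP(model, device_id=torch.cuda.current_device())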
2025-12-04T11:59:23.4228800Z ======================= 1 failed, 7 deselected in 8.43s ======================== 2025-12-04T11:59:23.4228837Z Got exit code 1 2025-12-04T11:59:23.4229010Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda 2025-12-04T11:59:23.4229138Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T11:59:23.4229344Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-4f67c54b925ea078.xml 2025-12-04T11:59:23.4229424Z ============================= test session starts ============================== 2025-12-04T11:59:23.4229537Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T11:59:23.4229625Z cachedir: .pytest_cache 2025-12-04T11:59:23.4229788Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:59:23.4229832Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:59:23.4229872Z configfile: pytest.ini 2025-12-04T11:59:23.4230035Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:59:23.4230106Z collecting ... collected 8 items / 8 deselected / 0 selected 2025-12-04T11:59:23.4230157Z stepcurrent: skipping 8 already run items. 2025-12-04T11:59:23.4230200Z Running 0 items in this shard 2025-12-04T11:59:23.4230203Z 2025-12-04T11:59:23.4230454Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-4f67c54b925ea078.xml - 2025-12-04T11:59:23.4230512Z ============================ 8 deselected in 0.00s ============================= 2025-12-04T11:59:23.4232019Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda', 'test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda', 'test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda', 'test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda', 'test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda', 'test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda', 'test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda', 'test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda'] 2025-12-04T11:59:23.4232048Z 2025-12-04T11:59:23.4232252Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_exec_order 1/1 (test/test-reports/distributed.fsdp.test_fsdp_exec_order_1.1_b71b860b1a78e6ee_.log) 2025-12-04T11:59:23.4232254Z 2025-12-04T11:59:23.4232387Z Finished distributed/fsdp/test_fsdp_exec_order 1/1 ... 
[2025-12-04 11:59:23.325635][5227604.304670961], took 4.32min 2025-12-04T11:59:23.4232679Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T11:59:23.4232767Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:59:23.4232861Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T11:59:23.4232907Z Uploading artifacts took 0.00 seconds 2025-12-04T11:59:23.4232965Z distributed/fsdp/test_fsdp_exec_order 1/1 failed! 2025-12-04T11:59:23.4233079Z Running distributed/test_distributed_spawn 2/7 ... [2025-12-04 11:59:23.327918][5227604.306959872] 2025-12-04T11:59:23.4233144Z MPI not available -- MPI backend tests will be skipped 2025-12-04T11:59:23.4233226Z Running distributed tests for the test backend with env init_method 2025-12-04T11:59:23.4233273Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:59:23.4233615Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=2', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:59:23.328420] 2025-12-04T11:59:25.1902694Z 2025-12-04T11:59:25.1904012Z distributed/test_distributed_spawn 2/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_2.7_b13ac23383d709c6_.log 2025-12-04T11:59:25.1904959Z Running 0 items in this shard: 2025-12-04T11:59:25.1905197Z 2025-12-04T11:59:25.1908857Z Running distributed tests for the test backend with file init_method 2025-12-04T11:59:25.1909638Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:59:25.1912532Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=2', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:59:25.191092] 2025-12-04T11:59:27.0556709Z 2025-12-04T11:59:27.0557720Z distributed/test_distributed_spawn 2/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_2.7_0b90106487d58c16_.log 2025-12-04T11:59:27.0558483Z Running 0 items in this shard: 2025-12-04T11:59:27.0558677Z 2025-12-04T11:59:27.0563113Z Running distributed tests for the nccl backend with env init_method 2025-12-04T11:59:27.0564890Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:59:27.0566501Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=2', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:59:27.056470] 2025-12-04T12:02:28.2516980Z 2025-12-04T12:02:28.2518340Z distributed/test_distributed_spawn 2/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_2.7_f619207b10d6dd9a_.log 2025-12-04T12:02:28.2532449Z Running 41 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_gradient, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_No_Affine, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_half, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_requires_grad, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_SyncBatchNorm_process_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_sum, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_cuda_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_average_parameters, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_control_flow_same_across_ranks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_remove_autograd_hooks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_destroy_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_rank, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_rank_size_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_irecv, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_world_size_not_divisible_by_group_size, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_periodic_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_periodic_model_averager_param_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_sum, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum_cuda_twice, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_checks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_object_list, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_any_source, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_any_source_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag_autograd_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag_torch_profiler 2025-12-04T12:02:28.2541593Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_gradient 2025-12-04T12:02:28.2542257Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_No_Affine 2025-12-04T12:02:28.2542830Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_half 2025-12-04T12:02:28.2543383Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_requires_grad 2025-12-04T12:02:28.2543905Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_SyncBatchNorm_process_group 2025-12-04T12:02:28.2544419Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_max 2025-12-04T12:02:28.2544951Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_sum 2025-12-04T12:02:28.2545436Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max 2025-12-04T12:02:28.2545827Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_max 2025-12-04T12:02:28.2546203Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_group_max 2025-12-04T12:02:28.2546574Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_group_product 2025-12-04T12:02:28.2546981Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_max 2025-12-04T12:02:28.2547329Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_min 2025-12-04T12:02:28.2547676Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum 2025-12-04T12:02:28.2548045Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_cuda_async 2025-12-04T12:02:28.2548455Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_complex 2025-12-04T12:02:28.2548889Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_cuda_complex 2025-12-04T12:02:28.2549291Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_average_parameters 2025-12-04T12:02:28.2549685Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier 2025-12-04T12:02:28.2550037Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_nccl 2025-12-04T12:02:28.2550400Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast 2025-12-04T12:02:28.2550770Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_control_flow_same_across_ranks 2025-12-04T12:02:28.2551167Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_remove_autograd_hooks 2025-12-04T12:02:28.2551533Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_destroy_group 2025-12-04T12:02:28.2551880Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_rank 2025-12-04T12:02:28.2552238Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_rank_size_full_group 2025-12-04T12:02:28.2552590Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_irecv 2025-12-04T12:02:28.2553029Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_world_size_not_divisible_by_group_size 2025-12-04T12:02:28.2553457Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_periodic_model_averager 2025-12-04T12:02:28.2553854Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_periodic_model_averager_param_group 2025-12-04T12:02:28.2554259Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_sum 2025-12-04T12:02:28.2554634Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum_cuda_twice 2025-12-04T12:02:28.2554996Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_checks 2025-12-04T12:02:28.2555343Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_cuda_complex 2025-12-04T12:02:28.2555686Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_group 2025-12-04T12:02:28.2556046Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_object_list 2025-12-04T12:02:28.2556396Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_any_source 2025-12-04T12:02:28.2556803Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_any_source_torch_profiler 2025-12-04T12:02:28.2557170Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag 2025-12-04T12:02:28.2557542Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag_autograd_profiler 2025-12-04T12:02:28.2557934Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag_torch_profiler 2025-12-04T12:02:28.2558149Z 2025-12-04T12:02:28.2558238Z Running distributed tests for the nccl backend with file init_method 2025-12-04T12:02:28.2558411Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:02:28.2558840Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=2', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:02:28.253032] 2025-12-04T12:05:27.3885469Z 2025-12-04T12:05:27.3889192Z distributed/test_distributed_spawn 2/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_2.7_3465749de903e098_.log 2025-12-04T12:05:27.3901097Z Running 41 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_gradient, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_No_Affine, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_half, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_requires_grad, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_SyncBatchNorm_process_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_sum, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_cuda_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_average_parameters, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_nccl, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_control_flow_same_across_ranks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_remove_autograd_hooks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_destroy_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_rank, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_rank_size_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_irecv, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_world_size_not_divisible_by_group_size, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_periodic_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_periodic_model_averager_param_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_sum, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum_cuda_twice, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_checks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_object_list, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_any_source, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_any_source_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag_autograd_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag_torch_profiler 2025-12-04T12:05:27.3909518Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_gradient 2025-12-04T12:05:27.3910158Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_No_Affine 2025-12-04T12:05:27.3910611Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_half 2025-12-04T12:05:27.3911051Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_requires_grad 2025-12-04T12:05:27.3911471Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_SyncBatchNorm_process_group 2025-12-04T12:05:27.3911932Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_max 2025-12-04T12:05:27.3912351Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_sum 2025-12-04T12:05:27.3912752Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max 2025-12-04T12:05:27.3913140Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_max 2025-12-04T12:05:27.3913520Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_group_max 2025-12-04T12:05:27.3913898Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_group_product 2025-12-04T12:05:27.3914268Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_max 2025-12-04T12:05:27.3914620Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_min 2025-12-04T12:05:27.3914967Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum 2025-12-04T12:05:27.3915338Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_cuda_async 2025-12-04T12:05:27.3915784Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_complex 2025-12-04T12:05:27.3916219Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_cuda_complex 2025-12-04T12:05:27.3916629Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_average_parameters 2025-12-04T12:05:27.3916985Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier 2025-12-04T12:05:27.3917336Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_nccl 2025-12-04T12:05:27.3917696Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast 2025-12-04T12:05:27.3918081Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_control_flow_same_across_ranks 2025-12-04T12:05:27.3918484Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_remove_autograd_hooks 2025-12-04T12:05:27.3918855Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_destroy_group 2025-12-04T12:05:27.3919205Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_rank 2025-12-04T12:05:27.3919564Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_rank_size_full_group 2025-12-04T12:05:27.3919982Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_irecv 2025-12-04T12:05:27.3920365Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_world_size_not_divisible_by_group_size 2025-12-04T12:05:27.3920770Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_periodic_model_averager 2025-12-04T12:05:27.3921147Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_periodic_model_averager_param_group 2025-12-04T12:05:27.3921523Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_sum 2025-12-04T12:05:27.3921932Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum_cuda_twice 2025-12-04T12:05:27.3922276Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_checks 2025-12-04T12:05:27.3922619Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_cuda_complex 2025-12-04T12:05:27.3922957Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_group 2025-12-04T12:05:27.3923296Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_object_list 2025-12-04T12:05:27.3923646Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_any_source 2025-12-04T12:05:27.3924024Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_any_source_torch_profiler 2025-12-04T12:05:27.3924396Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag 2025-12-04T12:05:27.3924764Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag_autograd_profiler 2025-12-04T12:05:27.3925160Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag_torch_profiler 2025-12-04T12:05:27.3925408Z 2025-12-04T12:05:27.3925505Z Running distributed tests for the gloo backend with env init_method 2025-12-04T12:05:27.3925675Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:05:27.3926110Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=2', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:05:27.389301] 2025-12-04T12:08:51.1799157Z 2025-12-04T12:08:51.1800371Z distributed/test_distributed_spawn 2/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_2.7_3795574493b85cde_.log 2025-12-04T12:08:51.1813151Z Running 41 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_gradient, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_No_Affine, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_half, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_requires_grad, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_SyncBatchNorm_process_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_sum, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_cuda_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_average_parameters, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_control_flow_same_across_ranks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_remove_autograd_hooks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_destroy_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_rank, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_rank_size_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_irecv, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_world_size_not_divisible_by_group_size, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_periodic_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_periodic_model_averager_param_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_sum, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum_cuda_twice, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_checks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_object_list, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_any_source, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_any_source_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag_autograd_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag_torch_profiler 2025-12-04T12:08:51.1822096Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_gradient 2025-12-04T12:08:51.1822726Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_No_Affine 2025-12-04T12:08:51.1823306Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_half 2025-12-04T12:08:51.1823873Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_requires_grad 2025-12-04T12:08:51.1824400Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_SyncBatchNorm_process_group 2025-12-04T12:08:51.1824813Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_max 2025-12-04T12:08:51.1825239Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_sum 2025-12-04T12:08:51.1825649Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max 2025-12-04T12:08:51.1826046Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_max 2025-12-04T12:08:51.1826488Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_group_max 2025-12-04T12:08:51.1826870Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_group_product 2025-12-04T12:08:51.1827246Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_max 2025-12-04T12:08:51.1827601Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_min 2025-12-04T12:08:51.1827957Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum 2025-12-04T12:08:51.1828327Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_cuda_async 2025-12-04T12:08:51.1828748Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_complex 2025-12-04T12:08:51.1829191Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_cuda_complex 2025-12-04T12:08:51.1829654Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_average_parameters 2025-12-04T12:08:51.1830008Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier 2025-12-04T12:08:51.1830414Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_nccl 2025-12-04T12:08:51.1830780Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast 2025-12-04T12:08:51.1831159Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_control_flow_same_across_ranks 2025-12-04T12:08:51.1831569Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_remove_autograd_hooks 2025-12-04T12:08:51.1831944Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_destroy_group 2025-12-04T12:08:51.1832293Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_rank 2025-12-04T12:08:51.1832658Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_rank_size_full_group 2025-12-04T12:08:51.1833021Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_irecv 2025-12-04T12:08:51.1833425Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_world_size_not_divisible_by_group_size 2025-12-04T12:08:51.1833859Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_periodic_model_averager 2025-12-04T12:08:51.1834267Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_periodic_model_averager_param_group 2025-12-04T12:08:51.1834651Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_sum 2025-12-04T12:08:51.1835002Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum_cuda_twice 2025-12-04T12:08:51.1835345Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_checks 2025-12-04T12:08:51.1835683Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_cuda_complex 2025-12-04T12:08:51.1836022Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_group 2025-12-04T12:08:51.1836396Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_object_list 2025-12-04T12:08:51.1836741Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_any_source 2025-12-04T12:08:51.1837110Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_any_source_torch_profiler 2025-12-04T12:08:51.1837476Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag 2025-12-04T12:08:51.1837845Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag_autograd_profiler 2025-12-04T12:08:51.1838232Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag_torch_profiler 2025-12-04T12:08:51.1838445Z 2025-12-04T12:08:51.1838536Z Running distributed tests for the gloo backend with file init_method 2025-12-04T12:08:51.1838708Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:08:51.1839138Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=2', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:08:51.180753] 2025-12-04T12:12:14.4865053Z 2025-12-04T12:12:14.4866222Z distributed/test_distributed_spawn 2/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_2.7_e5fa07b133880a43_.log 2025-12-04T12:12:14.4872977Z Running 41 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_gradient, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_No_Affine, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_half, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_requires_grad, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_SyncBatchNorm_process_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_sum, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_cuda_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_average_parameters, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_nccl, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_control_flow_same_across_ranks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_remove_autograd_hooks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_destroy_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_rank, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_rank_size_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_irecv, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_world_size_not_divisible_by_group_size, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_periodic_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_periodic_model_averager_param_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_sum, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum_cuda_twice, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_checks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_object_list, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_any_source, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_any_source_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag_autograd_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag_torch_profiler 2025-12-04T12:12:14.4881505Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_gradient 2025-12-04T12:12:14.4881967Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_No_Affine 2025-12-04T12:12:14.4882391Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_half 2025-12-04T12:12:14.4882813Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_requires_grad 2025-12-04T12:12:14.4883210Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_SyncBatchNorm_process_group 2025-12-04T12:12:14.4883597Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_max 2025-12-04T12:12:14.4884008Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_sum 2025-12-04T12:12:14.4884386Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max 2025-12-04T12:12:14.4884752Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_max 2025-12-04T12:12:14.4885115Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_group_max 2025-12-04T12:12:14.4885470Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_group_product 2025-12-04T12:12:14.4885818Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_max 2025-12-04T12:12:14.4886150Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_min 2025-12-04T12:12:14.4886560Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum 2025-12-04T12:12:14.4886908Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_cuda_async 2025-12-04T12:12:14.4887295Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_complex 2025-12-04T12:12:14.4887717Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_cuda_complex 2025-12-04T12:12:14.4888105Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_average_parameters 2025-12-04T12:12:14.4888439Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier 2025-12-04T12:12:14.4888776Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_nccl 2025-12-04T12:12:14.4889118Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast 2025-12-04T12:12:14.4889472Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_control_flow_same_across_ranks 2025-12-04T12:12:14.4889879Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_remove_autograd_hooks 2025-12-04T12:12:14.4890253Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_destroy_group 2025-12-04T12:12:14.4890577Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_rank 2025-12-04T12:12:14.4890913Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_get_rank_size_full_group 2025-12-04T12:12:14.4891321Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_irecv 2025-12-04T12:12:14.4891699Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_world_size_not_divisible_by_group_size 2025-12-04T12:12:14.4892096Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_periodic_model_averager 2025-12-04T12:12:14.4892472Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_periodic_model_averager_param_group 2025-12-04T12:12:14.4892847Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_sum 2025-12-04T12:12:14.4893204Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum_cuda_twice 2025-12-04T12:12:14.4893548Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_checks 2025-12-04T12:12:14.4893885Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_cuda_complex 2025-12-04T12:12:14.4894222Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_group 2025-12-04T12:12:14.4894560Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_object_list 2025-12-04T12:12:14.4894905Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_any_source 2025-12-04T12:12:14.4895274Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_any_source_torch_profiler 2025-12-04T12:12:14.4895640Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag 2025-12-04T12:12:14.4896054Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag_autograd_profiler 2025-12-04T12:12:14.4896444Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_with_tag_torch_profiler 2025-12-04T12:12:14.4896655Z 2025-12-04T12:12:14.4896789Z Finished distributed/test_distributed_spawn 2/7 ... [2025-12-04 12:12:14.487035][5228375.466071689], took 12.85min 2025-12-04T12:12:14.4897240Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T12:12:14.4897637Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:12:14.4897874Z Running distributed/test_distributed_spawn 5/7 ... [2025-12-04 12:12:14.489446][5228375.468488448] 2025-12-04T12:12:14.4898099Z MPI not available -- MPI backend tests will be skipped 2025-12-04T12:12:14.4898285Z Running distributed tests for the test backend with env init_method 2025-12-04T12:12:14.4898459Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:12:14.4899369Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=5', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:12:14.489840] 2025-12-04T12:12:16.3505707Z 2025-12-04T12:12:16.3506652Z distributed/test_distributed_spawn 5/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_5.7_ac3e1d0a6320b1a9_.log 2025-12-04T12:12:16.3507558Z Running 0 items in this shard: 2025-12-04T12:12:16.3507772Z 2025-12-04T12:12:16.3512943Z Running distributed tests for the test backend with file init_method 2025-12-04T12:12:16.3515309Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:12:16.3517272Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=5', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:12:16.351544] 2025-12-04T12:12:18.2100014Z 2025-12-04T12:12:18.2100906Z distributed/test_distributed_spawn 5/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_5.7_77501df2042f0bce_.log 2025-12-04T12:12:18.2101651Z Running 0 items in this shard: 2025-12-04T12:12:18.2101835Z 2025-12-04T12:12:18.2106122Z Running distributed tests for the nccl backend with env init_method 2025-12-04T12:12:18.2108515Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:12:18.2110325Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=5', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:12:18.210860] 2025-12-04T12:16:19.3652141Z 2025-12-04T12:16:19.3653199Z distributed/test_distributed_spawn 5/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_5.7_dc8d67f6dc9d3408_.log 2025-12-04T12:16:19.3666451Z Running 49 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_Running_Value, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Single_Input_Per_Process, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedSampler_padding, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_default_pg, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_sum, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_complex_unsupported_ops, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_group, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_global, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_gloo_tags, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_mixed_backend_err, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_self_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_apply_optim_in_backward_ignored_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_broadcast_buffer, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_parity_allreduce_process_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_ignore_params_arg, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_logging_data_cpu, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_autograd_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_execution_trace, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_python_error_logged, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_returns_tensor_with_no_grad, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_shared_grad_acc_unused_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_input_join_disable, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_dump_DDP_relevant_env_vars, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_object, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_group_size_exceeds_world_size, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_tuple_module, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_parity_with_hierarchical_sgd_grad_is_view, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum_twice, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_any_source_autograd_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_sparse_all_reduce_sum_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_stateless_api_with_ddp, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_verify_model_across_rank_without_logger 2025-12-04T12:16:19.3676064Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_Running_Value 2025-12-04T12:16:19.3676645Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Single_Input_Per_Process 2025-12-04T12:16:19.3677151Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedSampler_padding 2025-12-04T12:16:19.3677570Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather 2025-12-04T12:16:19.3677965Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_cuda 2025-12-04T12:16:19.3678392Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_default_pg 2025-12-04T12:16:19.3678843Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_product 2025-12-04T12:16:19.3679282Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_min 2025-12-04T12:16:19.3679707Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_product 2025-12-04T12:16:19.3680128Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_sum 2025-12-04T12:16:19.3680509Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_complex_unsupported_ops 2025-12-04T12:16:19.3680887Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_complex 2025-12-04T12:16:19.3681242Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_cuda 2025-12-04T12:16:19.3681590Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_group 2025-12-04T12:16:19.3681963Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_complex 2025-12-04T12:16:19.3682375Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_cuda 2025-12-04T12:16:19.3682755Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_global 2025-12-04T12:16:19.3683119Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_gloo_tags 2025-12-04T12:16:19.3683501Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_mixed_backend_err 2025-12-04T12:16:19.3683887Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_self_nccl 2025-12-04T12:16:19.3684286Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_apply_optim_in_backward_ignored_params 2025-12-04T12:16:19.3684719Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_broadcast_buffer 2025-12-04T12:16:19.3685105Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_parity_allreduce_process_group 2025-12-04T12:16:19.3685489Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_ignore_params_arg 2025-12-04T12:16:19.3685842Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_logging_data_cpu 2025-12-04T12:16:19.3686218Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_autograd_profiler 2025-12-04T12:16:19.3686602Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_execution_trace 2025-12-04T12:16:19.3686972Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_python_error_logged 2025-12-04T12:16:19.3687350Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_returns_tensor_with_no_grad 2025-12-04T12:16:19.3687742Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_shared_grad_acc_unused_params 2025-12-04T12:16:19.3688129Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_input_join_disable 2025-12-04T12:16:19.3688518Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_dump_DDP_relevant_env_vars 2025-12-04T12:16:19.3688876Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_full_group 2025-12-04T12:16:19.3689218Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_object 2025-12-04T12:16:19.3689626Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo 2025-12-04T12:16:19.3690016Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_group_size_exceeds_world_size 2025-12-04T12:16:19.3690424Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_tuple_module 2025-12-04T12:16:19.3690861Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_parity_with_hierarchical_sgd_grad_is_view 2025-12-04T12:16:19.3691279Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_max 2025-12-04T12:16:19.3691620Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_max 2025-12-04T12:16:19.3691957Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_product 2025-12-04T12:16:19.3692295Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum_twice 
2025-12-04T12:16:19.3692638Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_full_group 2025-12-04T12:16:19.3692976Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv 2025-12-04T12:16:19.3693340Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_any_source_autograd_profiler 2025-12-04T12:16:19.3693730Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl_torch_profiler 2025-12-04T12:16:19.3694109Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_sparse_all_reduce_sum_cuda 2025-12-04T12:16:19.3694508Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_stateless_api_with_ddp 2025-12-04T12:16:19.3694890Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_verify_model_across_rank_without_logger 2025-12-04T12:16:19.3695111Z 2025-12-04T12:16:19.3695203Z Running distributed tests for the nccl backend with file init_method 2025-12-04T12:16:19.3695375Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:16:19.3695806Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=5', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:16:19.366119] 2025-12-04T12:20:22.2374264Z 2025-12-04T12:20:22.2375426Z distributed/test_distributed_spawn 5/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_5.7_826092b9f65b8986_.log 2025-12-04T12:20:22.2389967Z Running 49 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_Running_Value, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Single_Input_Per_Process, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedSampler_padding, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_default_pg, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_sum, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_complex_unsupported_ops, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_group, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_global, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_gloo_tags, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_mixed_backend_err, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_self_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_apply_optim_in_backward_ignored_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_broadcast_buffer, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_parity_allreduce_process_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_ignore_params_arg, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_logging_data_cpu, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_autograd_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_execution_trace, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_python_error_logged, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_returns_tensor_with_no_grad, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_shared_grad_acc_unused_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_input_join_disable, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_dump_DDP_relevant_env_vars, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_object, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_group_size_exceeds_world_size, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_tuple_module, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_parity_with_hierarchical_sgd_grad_is_view, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum_twice, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_any_source_autograd_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_sparse_all_reduce_sum_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_stateless_api_with_ddp, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_verify_model_across_rank_without_logger 2025-12-04T12:20:22.2399918Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_Running_Value 2025-12-04T12:20:22.2400443Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Single_Input_Per_Process 2025-12-04T12:20:22.2400907Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedSampler_padding 2025-12-04T12:20:22.2401292Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather 2025-12-04T12:20:22.2401652Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_cuda 2025-12-04T12:20:22.2402030Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_default_pg 2025-12-04T12:20:22.2402448Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_product 2025-12-04T12:20:22.2402872Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_min 2025-12-04T12:20:22.2403288Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_product 2025-12-04T12:20:22.2403693Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_sum 2025-12-04T12:20:22.2404095Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_complex_unsupported_ops 2025-12-04T12:20:22.2404548Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_complex 2025-12-04T12:20:22.2404934Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_cuda 2025-12-04T12:20:22.2405289Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_group 2025-12-04T12:20:22.2405680Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_complex 2025-12-04T12:20:22.2406098Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_cuda 2025-12-04T12:20:22.2406495Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_global 2025-12-04T12:20:22.2406881Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_gloo_tags 2025-12-04T12:20:22.2407279Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_mixed_backend_err 2025-12-04T12:20:22.2407680Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_self_nccl 2025-12-04T12:20:22.2408090Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_apply_optim_in_backward_ignored_params 2025-12-04T12:20:22.2408520Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_broadcast_buffer 2025-12-04T12:20:22.2408916Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_parity_allreduce_process_group 2025-12-04T12:20:22.2409322Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_ignore_params_arg 2025-12-04T12:20:22.2409723Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_logging_data_cpu 2025-12-04T12:20:22.2410094Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_autograd_profiler 2025-12-04T12:20:22.2410480Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_execution_trace 2025-12-04T12:20:22.2410879Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_python_error_logged 2025-12-04T12:20:22.2411252Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_returns_tensor_with_no_grad 2025-12-04T12:20:22.2411643Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_shared_grad_acc_unused_params 2025-12-04T12:20:22.2412028Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_input_join_disable 2025-12-04T12:20:22.2412402Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_dump_DDP_relevant_env_vars 2025-12-04T12:20:22.2412761Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_full_group 2025-12-04T12:20:22.2413101Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_object 2025-12-04T12:20:22.2413444Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo 2025-12-04T12:20:22.2413830Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_group_size_exceeds_world_size 2025-12-04T12:20:22.2414273Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_tuple_module 2025-12-04T12:20:22.2414709Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_parity_with_hierarchical_sgd_grad_is_view 2025-12-04T12:20:22.2415125Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_max 2025-12-04T12:20:22.2415463Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_max 2025-12-04T12:20:22.2415789Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_product 2025-12-04T12:20:22.2416123Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum_twice 
2025-12-04T12:20:22.2416469Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_full_group 2025-12-04T12:20:22.2416806Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv 2025-12-04T12:20:22.2417166Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_any_source_autograd_profiler 2025-12-04T12:20:22.2417558Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl_torch_profiler 2025-12-04T12:20:22.2418031Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_sparse_all_reduce_sum_cuda 2025-12-04T12:20:22.2418392Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_stateless_api_with_ddp 2025-12-04T12:20:22.2418795Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_verify_model_across_rank_without_logger 2025-12-04T12:20:22.2419023Z 2025-12-04T12:20:22.2419109Z Running distributed tests for the gloo backend with env init_method 2025-12-04T12:20:22.2419279Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:20:22.2419746Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=5', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:20:22.238291] 2025-12-04T12:24:27.6468820Z 2025-12-04T12:24:27.6470240Z distributed/test_distributed_spawn 5/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_5.7_13c969534f9043ac_.log 2025-12-04T12:24:27.6486846Z Running 49 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_Running_Value, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Single_Input_Per_Process, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedSampler_padding, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_default_pg, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_sum, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_complex_unsupported_ops, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_group, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_global, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_gloo_tags, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_mixed_backend_err, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_self_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_apply_optim_in_backward_ignored_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_broadcast_buffer, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_parity_allreduce_process_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_ignore_params_arg, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_logging_data_cpu, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_autograd_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_execution_trace, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_python_error_logged, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_returns_tensor_with_no_grad, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_shared_grad_acc_unused_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_input_join_disable, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_dump_DDP_relevant_env_vars, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_object, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_group_size_exceeds_world_size, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_tuple_module, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_parity_with_hierarchical_sgd_grad_is_view, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum_twice, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_any_source_autograd_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_sparse_all_reduce_sum_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_stateless_api_with_ddp, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_verify_model_across_rank_without_logger 2025-12-04T12:24:27.6496710Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_Running_Value 2025-12-04T12:24:27.6497281Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Single_Input_Per_Process 2025-12-04T12:24:27.6497778Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedSampler_padding 2025-12-04T12:24:27.6498182Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather 2025-12-04T12:24:27.6498562Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_cuda 2025-12-04T12:24:27.6498974Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_default_pg 2025-12-04T12:24:27.6499431Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_product 2025-12-04T12:24:27.6499921Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_min 2025-12-04T12:24:27.6500320Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_product 2025-12-04T12:24:27.6500694Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_sum 2025-12-04T12:24:27.6501133Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_complex_unsupported_ops 2025-12-04T12:24:27.6501512Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_complex 2025-12-04T12:24:27.6501893Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_cuda 2025-12-04T12:24:27.6502234Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_group 2025-12-04T12:24:27.6502602Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_complex 2025-12-04T12:24:27.6503003Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_cuda 2025-12-04T12:24:27.6503380Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_global 2025-12-04T12:24:27.6503748Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_gloo_tags 2025-12-04T12:24:27.6504131Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_mixed_backend_err 2025-12-04T12:24:27.6504515Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_self_nccl 2025-12-04T12:24:27.6504907Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_apply_optim_in_backward_ignored_params 2025-12-04T12:24:27.6505297Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_broadcast_buffer 2025-12-04T12:24:27.6505677Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_parity_allreduce_process_group 2025-12-04T12:24:27.6506055Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_ignore_params_arg 2025-12-04T12:24:27.6506410Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_logging_data_cpu 2025-12-04T12:24:27.6506816Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_autograd_profiler 2025-12-04T12:24:27.6507200Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_execution_trace 2025-12-04T12:24:27.6507567Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_python_error_logged 2025-12-04T12:24:27.6507940Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_returns_tensor_with_no_grad 2025-12-04T12:24:27.6508324Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_shared_grad_acc_unused_params 2025-12-04T12:24:27.6508704Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_input_join_disable 2025-12-04T12:24:27.6509081Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_dump_DDP_relevant_env_vars 2025-12-04T12:24:27.6509438Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_full_group 2025-12-04T12:24:27.6509814Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_object 2025-12-04T12:24:27.6510158Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo 2025-12-04T12:24:27.6510565Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_group_size_exceeds_world_size 2025-12-04T12:24:27.6510973Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_tuple_module 2025-12-04T12:24:27.6511427Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_parity_with_hierarchical_sgd_grad_is_view 2025-12-04T12:24:27.6511842Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_max 2025-12-04T12:24:27.6512178Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_max 2025-12-04T12:24:27.6512508Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_product 2025-12-04T12:24:27.6512849Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum_twice 
2025-12-04T12:24:27.6513190Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_full_group 2025-12-04T12:24:27.6513523Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv 2025-12-04T12:24:27.6513888Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_any_source_autograd_profiler 2025-12-04T12:24:27.6514278Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl_torch_profiler 2025-12-04T12:24:27.6514651Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_sparse_all_reduce_sum_cuda 2025-12-04T12:24:27.6515016Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_stateless_api_with_ddp 2025-12-04T12:24:27.6515399Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_verify_model_across_rank_without_logger 2025-12-04T12:24:27.6515619Z 2025-12-04T12:24:27.6515707Z Running distributed tests for the gloo backend with file init_method 2025-12-04T12:24:27.6515972Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:24:27.6516496Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=5', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:24:27.648439] 2025-12-04T12:28:33.3971719Z 2025-12-04T12:28:33.3972828Z distributed/test_distributed_spawn 5/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_5.7_3f8f9b2ee17a8cb7_.log 2025-12-04T12:28:33.3981221Z Running 49 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_Running_Value, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Single_Input_Per_Process, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedSampler_padding, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_default_pg, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_sum, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_complex_unsupported_ops, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_group, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_global, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_gloo_tags, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_mixed_backend_err, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_self_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_apply_optim_in_backward_ignored_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_broadcast_buffer, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_parity_allreduce_process_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_ignore_params_arg, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_logging_data_cpu, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_autograd_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_execution_trace, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_python_error_logged, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_returns_tensor_with_no_grad, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_shared_grad_acc_unused_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_input_join_disable, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_dump_DDP_relevant_env_vars, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_object, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_group_size_exceeds_world_size, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_tuple_module, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_parity_with_hierarchical_sgd_grad_is_view, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum_twice, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_any_source_autograd_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_sparse_all_reduce_sum_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_stateless_api_with_ddp, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_verify_model_across_rank_without_logger 2025-12-04T12:28:33.3988986Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Diff_Input_Sizes_Running_Value 2025-12-04T12:28:33.3989521Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Single_Input_Per_Process 2025-12-04T12:28:33.3990055Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedSampler_padding 2025-12-04T12:28:33.3990458Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather 2025-12-04T12:28:33.3990826Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_cuda 2025-12-04T12:28:33.3991217Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_default_pg 2025-12-04T12:28:33.3991679Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_full_group_product 2025-12-04T12:28:33.3992308Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_min 2025-12-04T12:28:33.3992742Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_group_product 2025-12-04T12:28:33.3993145Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_sum 2025-12-04T12:28:33.3993548Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_complex_unsupported_ops 2025-12-04T12:28:33.3993963Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_complex 2025-12-04T12:28:33.3994345Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_cuda 2025-12-04T12:28:33.3994730Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_group 2025-12-04T12:28:33.3995181Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_complex 2025-12-04T12:28:33.3995609Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_cuda 2025-12-04T12:28:33.3996026Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_global 2025-12-04T12:28:33.3996818Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_gloo_tags 2025-12-04T12:28:33.3997224Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_mixed_backend_err 2025-12-04T12:28:33.3997650Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_self_nccl 2025-12-04T12:28:33.3998073Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_apply_optim_in_backward_ignored_params 2025-12-04T12:28:33.3998495Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_broadcast_buffer 2025-12-04T12:28:33.3998911Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_parity_allreduce_process_group 2025-12-04T12:28:33.3999319Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_ignore_params_arg 2025-12-04T12:28:33.3999836Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_logging_data_cpu 2025-12-04T12:28:33.4000231Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_autograd_profiler 2025-12-04T12:28:33.4000646Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_execution_trace 2025-12-04T12:28:33.4001081Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_python_error_logged 2025-12-04T12:28:33.4001476Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_returns_tensor_with_no_grad 2025-12-04T12:28:33.4001906Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_shared_grad_acc_unused_params 2025-12-04T12:28:33.4002321Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_input_join_disable 2025-12-04T12:28:33.4002720Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_dump_DDP_relevant_env_vars 2025-12-04T12:28:33.4003119Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_full_group 2025-12-04T12:28:33.4003487Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_object 2025-12-04T12:28:33.4003873Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo 2025-12-04T12:28:33.4004287Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_group_size_exceeds_world_size 2025-12-04T12:28:33.4004732Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_tuple_module 2025-12-04T12:28:33.4005966Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_parity_with_hierarchical_sgd_grad_is_view 2025-12-04T12:28:33.4006406Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_max 2025-12-04T12:28:33.4006813Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_max 2025-12-04T12:28:33.4007188Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_product 2025-12-04T12:28:33.4007555Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_sum_twice 
2025-12-04T12:28:33.4007927Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_scatter_full_group 2025-12-04T12:28:33.4008297Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv 2025-12-04T12:28:33.4008691Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_any_source_autograd_profiler 2025-12-04T12:28:33.4009129Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl_torch_profiler 2025-12-04T12:28:33.4009546Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_sparse_all_reduce_sum_cuda 2025-12-04T12:28:33.4009981Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_stateless_api_with_ddp 2025-12-04T12:28:33.4010398Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_verify_model_across_rank_without_logger 2025-12-04T12:28:33.4010655Z 2025-12-04T12:28:33.4010808Z Finished distributed/test_distributed_spawn 5/7 ... [2025-12-04 12:28:33.397987][5229354.377023565], took 16.32min 2025-12-04T12:28:33.4011299Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T12:28:33.4011738Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:28:33.4012022Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T12:28:33.4021622Z Uploading artifacts took 0.00 seconds 2025-12-04T12:28:33.4021836Z Running distributed/fsdp/test_fsdp_input 1/1 ... [2025-12-04 12:28:33.400657][5229354.3796988] 2025-12-04T12:28:33.4022074Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:28:33.4022491Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_input.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:28:33.400843] 2025-12-04T12:29:30.3417249Z 2025-12-04T12:29:30.3418187Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_input 1/1 (test/test-reports/distributed.fsdp.test_fsdp_input_1.1_bc379566c9ef67b0_.log) 2025-12-04T12:29:30.3419556Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_input/distributed.fsdp.test_fsdp_input-e478acc36cfea895.xml 2025-12-04T12:29:30.3420582Z ============================= test session starts ============================== 2025-12-04T12:29:30.3421224Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:29:30.3421752Z cachedir: .pytest_cache 2025-12-04T12:29:30.3423630Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:29:30.3424311Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:29:30.3424642Z configfile: pytest.ini 2025-12-04T12:29:30.3425326Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:29:30.3425978Z collecting ... 
collected 2 items 2025-12-04T12:29:30.3426243Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T12:29:30.3427623Z Running 2 items in this shard: test/distributed/fsdp/test_fsdp_input.py::TestInputCUDA::test_input_type_dict_cuda, test/distributed/fsdp/test_fsdp_input.py::TestInputCUDA::test_input_type_list_cuda 2025-12-04T12:29:30.3428227Z 2025-12-04T12:29:30.3428803Z distributed/fsdp/test_fsdp_input.py::TestInputCUDA::test_input_type_dict_cuda I1204 12:28:35.242000 317063 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 317132 2025-12-04T12:29:30.3430032Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T12:29:30.3430798Z _init_core_state( 2025-12-04T12:29:30.3433382Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
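[Editor's note] For context on the "Started process 0 with pid ..." line above and on the earlier "gloo backend with env init_method" / "file init_method" runs in this shard: the spawn-based distributed tests launch one worker per rank and have each worker join a process group before running the test body. Below is a minimal sketch of that pattern under assumed settings (local single-rank group, hypothetical MASTER_PORT of 29500); the real harness in torch/testing/_internal/common_distributed.py differs in detail. A file init_method run would differ only in the rendezvous URL, e.g. init_method="file:///tmp/some_shared_file".

    import os
    import torch.distributed as dist
    import torch.multiprocessing as mp

    def _worker(rank: int, world_size: int) -> None:
        # Each spawned worker joins the group via the env:// rendezvous,
        # runs its share of the test, and tears the group down on exit.
        dist.init_process_group("gloo", init_method="env://",
                                rank=rank, world_size=world_size)
        try:
            pass  # test body would run here
        finally:
            dist.destroy_process_group()

    if __name__ == "__main__":
        os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
        os.environ.setdefault("MASTER_PORT", "29500")  # hypothetical port, not from this job
        mp.spawn(_worker, args=(1,), nprocs=1)
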
2025-12-04T12:29:30.3436188Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:29:30.3436618Z [rank0]:E1204 12:28:40.445000 317132 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:29:30.3437238Z [rank0]:E1204 12:28:40.445000 317132 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:29:30.3437941Z [rank0]:E1204 12:28:40.445000 317132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:30.3438615Z [rank0]:E1204 12:28:40.445000 317132 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:29:30.3439292Z [rank0]:E1204 12:28:40.445000 317132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:30.3439989Z [rank0]:E1204 12:28:40.445000 317132 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:29:30.3440615Z [rank0]:E1204 12:28:40.445000 317132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3441275Z [rank0]:E1204 12:28:40.445000 317132 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:29:30.3441931Z [rank0]:E1204 12:28:40.445000 317132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3442578Z [rank0]:E1204 12:28:40.445000 317132 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:29:30.3443230Z [rank0]:E1204 12:28:40.445000 317132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:30.3443940Z [rank0]:E1204 12:28:40.445000 317132 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:29:30.3444585Z [rank0]:E1204 12:28:40.445000 317132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:30.3445234Z [rank0]:E1204 12:28:40.445000 317132 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:29:30.3446108Z [rank0]:E1204 12:28:40.445000 317132 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestInputCUDA.test_input_type_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 1633681408 and is now 2130706432. 
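[Editor's note] The RuntimeError above comes from the CUDA memory-leak check that this job enables (PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 in the repro line that follows). Conceptually, it snapshots caching-allocator usage before the test and compares afterwards. The sketch below only illustrates that idea with made-up names and requires a CUDA/ROCm device; the actual check in torch/testing/_internal/common_utils.py also tracks driver-level memory and retries to filter out noise.

    import torch

    def run_with_leak_check(test_fn, device: int = 0) -> None:
        # Illustrative only: compare caching-allocator usage before and after the test.
        torch.cuda.synchronize(device)
        before = torch.cuda.memory_allocated(device)
        test_fn()
        torch.cuda.synchronize(device)
        after = torch.cuda.memory_allocated(device)
        if after > before:
            raise RuntimeError(
                f"possible leak: caching allocator allocated memory was {before} "
                f"and is now reported as {after} on device {device}"
            )
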
2025-12-04T12:29:30.3446975Z [rank0]:E1204 12:28:40.445000 317132 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:29:30.3447367Z [rank0]:E1204 12:28:40.445000 317132 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:29:30.3447968Z [rank0]:E1204 12:28:40.445000 317132 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_input.py TestInputCUDA.test_input_type_dict_cuda 2025-12-04T12:29:30.3448500Z [rank0]:E1204 12:28:40.445000 317132 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:29:30.3449016Z [rank0]:E1204 12:28:40.445000 317132 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:30.3449474Z [rank0]:E1204 12:28:40.445000 317132 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:29:30.3449815Z dist init r=0, world=1 2025-12-04T12:29:30.3450273Z [rank0]:[W1204 12:28:40.607400974 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:29:30.3450719Z FAILED [6.7122s] [ 50%] 2025-12-04T12:29:30.3450796Z 2025-12-04T12:29:30.3450859Z =================================== FAILURES =================================== 2025-12-04T12:29:30.3451061Z ___________________ TestInputCUDA.test_input_type_dict_cuda ____________________ 2025-12-04T12:29:30.3451249Z Traceback (most recent call last): 2025-12-04T12:29:30.3451524Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:29:30.3451794Z self._join_processes(fn) 2025-12-04T12:29:30.3452075Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:29:30.3452372Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:29:30.3452666Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:29:30.3452959Z raise RuntimeError(error) 2025-12-04T12:29:30.3453125Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:29:30.3453304Z Traceback (most recent call last): 2025-12-04T12:29:30.3453570Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:30.3453838Z getattr(self, test_name)() 2025-12-04T12:29:30.3454100Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:30.3454365Z fn() 2025-12-04T12:29:30.3454641Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3454900Z method(*args, **kwargs) 2025-12-04T12:29:30.3455148Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3455402Z method(*args, **kwargs) 2025-12-04T12:29:30.3455650Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:30.3455902Z with policy(): 2025-12-04T12:29:30.3456134Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:30.3456390Z raise RuntimeError(msg) 2025-12-04T12:29:30.3456791Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestInputCUDA.test_input_type_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 1633681408 and is now 2130706432. 2025-12-04T12:29:30.3457136Z 2025-12-04T12:29:30.3457210Z To execute this test, run the following from the base repo dir: 2025-12-04T12:29:30.3457512Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_input.py TestInputCUDA.test_input_type_dict_cuda 2025-12-04T12:29:30.3457735Z 2025-12-04T12:29:30.3457825Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:30.3457952Z 2025-12-04T12:29:30.3457953Z 2025-12-04T12:29:30.3458052Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:29:30.3458253Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:29:30.3458623Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_input/distributed.fsdp.test_fsdp_input-e478acc36cfea895.xml - 2025-12-04T12:29:30.3458977Z =========================== short test summary info ============================ 2025-12-04T12:29:30.3459283Z FAILED [6.7122s] distributed/fsdp/test_fsdp_input.py::TestInputCUDA::test_input_type_dict_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:29:30.3459616Z Traceback (most recent call last): 2025-12-04T12:29:30.3459864Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:30.3460114Z getattr(self, test_name)() 2025-12-04T12:29:30.3460347Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:30.3460586Z fn() 2025-12-04T12:29:30.3460792Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3461021Z method(*args, **kwargs) 2025-12-04T12:29:30.3461241Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3461470Z method(*args, **kwargs) 2025-12-04T12:29:30.3461686Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:30.3461911Z with policy(): 2025-12-04T12:29:30.3462122Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:30.3462352Z raise RuntimeError(msg) 2025-12-04T12:29:30.3462729Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestInputCUDA.test_input_type_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 1633681408 and is now 2130706432. 
2025-12-04T12:29:30.3463065Z 2025-12-04T12:29:30.3463142Z To execute this test, run the following from the base repo dir: 2025-12-04T12:29:30.3463473Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_input.py TestInputCUDA.test_input_type_dict_cuda 2025-12-04T12:29:30.3463692Z 2025-12-04T12:29:30.3463778Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:30.3463965Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:29:30.3464121Z ============================== 1 failed in 6.72s =============================== 2025-12-04T12:29:30.3464253Z Got exit code 1 2025-12-04T12:29:30.3464349Z Retrying single test... 2025-12-04T12:29:30.3464609Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_input/distributed.fsdp.test_fsdp_input-22b8fd4eb01f818e.xml 2025-12-04T12:29:30.3464891Z ============================= test session starts ============================== 2025-12-04T12:29:30.3465099Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:29:30.3465289Z cachedir: .pytest_cache 2025-12-04T12:29:30.3465514Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:29:30.3465752Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:29:30.3465868Z configfile: pytest.ini 2025-12-04T12:29:30.3466096Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:29:30.3466364Z collecting ... collected 2 items / 1 deselected / 1 selected 2025-12-04T12:29:30.3466648Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_input.py::TestInputCUDA::test_input_type_dict_cuda 2025-12-04T12:29:30.3466938Z Running 1 items in this shard 2025-12-04T12:29:30.3467012Z 2025-12-04T12:29:30.3467276Z distributed/fsdp/test_fsdp_input.py::TestInputCUDA::test_input_type_dict_cuda I1204 12:28:44.460000 317215 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 317284 2025-12-04T12:29:30.3467893Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T12:29:30.3468260Z _init_core_state( 2025-12-04T12:29:30.3469669Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. 
(Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:29:30.3471088Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:29:30.3471394Z [rank0]:E1204 12:28:49.624000 317284 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:29:30.3471741Z [rank0]:E1204 12:28:49.624000 317284 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:29:30.3472240Z [rank0]:E1204 12:28:49.624000 317284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:30.3472752Z [rank0]:E1204 12:28:49.624000 317284 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:29:30.3473230Z [rank0]:E1204 12:28:49.624000 317284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:30.3473673Z [rank0]:E1204 12:28:49.624000 317284 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:29:30.3474111Z [rank0]:E1204 12:28:49.624000 317284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3474571Z [rank0]:E1204 12:28:49.624000 317284 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:29:30.3475033Z [rank0]:E1204 12:28:49.624000 317284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3475498Z [rank0]:E1204 12:28:49.624000 317284 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:29:30.3475964Z [rank0]:E1204 12:28:49.624000 317284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:30.3476429Z [rank0]:E1204 12:28:49.624000 317284 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:29:30.3476886Z [rank0]:E1204 12:28:49.624000 317284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:30.3477366Z [rank0]:E1204 12:28:49.624000 317284 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:29:30.3477982Z [rank0]:E1204 12:28:49.624000 317284 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestInputCUDA.test_input_type_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 1633681408 and is now 2130706432. 
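[Editor's note] For scale, the figures repeated in each failure are tiny on the allocator side but large at the driver level (assuming both are byte counts, as torch.cuda reports them):

    # Values copied from the RuntimeError above.
    alloc_before, alloc_after = 512, 1024           # caching allocator, bytes
    drv_before, drv_after = 1633681408, 2130706432  # driver-reported, bytes
    print(alloc_after - alloc_before)               # 512 bytes of allocator growth
    print((drv_after - drv_before) / 2**20)         # 474.0 MiB of driver growth
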
2025-12-04T12:29:30.3478558Z [rank0]:E1204 12:28:49.624000 317284 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:29:30.3478907Z [rank0]:E1204 12:28:49.624000 317284 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:29:30.3479455Z [rank0]:E1204 12:28:49.624000 317284 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_input.py TestInputCUDA.test_input_type_dict_cuda 2025-12-04T12:29:30.3479955Z [rank0]:E1204 12:28:49.624000 317284 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:29:30.3480321Z [rank0]:E1204 12:28:49.624000 317284 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:30.3480732Z [rank0]:E1204 12:28:49.624000 317284 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:29:30.3480973Z dist init r=0, world=1 2025-12-04T12:29:30.3481372Z [rank0]:[W1204 12:28:49.781038386 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:29:30.3481783Z FAILED [6.7126s] [100%] 2025-12-04T12:29:30.3481848Z 2025-12-04T12:29:30.3481936Z =================================== FAILURES =================================== 2025-12-04T12:29:30.3482134Z ___________________ TestInputCUDA.test_input_type_dict_cuda ____________________ 2025-12-04T12:29:30.3482307Z Traceback (most recent call last): 2025-12-04T12:29:30.3482559Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:29:30.3482811Z self._join_processes(fn) 2025-12-04T12:29:30.3483066Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:29:30.3483337Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:29:30.3483612Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:29:30.3483880Z raise RuntimeError(error) 2025-12-04T12:29:30.3484039Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:29:30.3484206Z Traceback (most recent call last): 2025-12-04T12:29:30.3484453Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:30.3484702Z getattr(self, test_name)() 2025-12-04T12:29:30.3484941Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:30.3485199Z fn() 2025-12-04T12:29:30.3485408Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3485643Z method(*args, **kwargs) 2025-12-04T12:29:30.3485870Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3486124Z method(*args, **kwargs) 2025-12-04T12:29:30.3486354Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:30.3486585Z with policy(): 2025-12-04T12:29:30.3486802Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:30.3487039Z raise RuntimeError(msg) 2025-12-04T12:29:30.3487416Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestInputCUDA.test_input_type_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 1633681408 and is now 2130706432. 2025-12-04T12:29:30.3487761Z 2025-12-04T12:29:30.3487842Z To execute this test, run the following from the base repo dir: 2025-12-04T12:29:30.3488143Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_input.py TestInputCUDA.test_input_type_dict_cuda 2025-12-04T12:29:30.3488366Z 2025-12-04T12:29:30.3488467Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:30.3488593Z 2025-12-04T12:29:30.3488594Z 2025-12-04T12:29:30.3488679Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:29:30.3488884Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:29:30.3489251Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_input/distributed.fsdp.test_fsdp_input-22b8fd4eb01f818e.xml - 2025-12-04T12:29:30.3489623Z =========================== short test summary info ============================ 2025-12-04T12:29:30.3489927Z FAILED [6.7126s] distributed/fsdp/test_fsdp_input.py::TestInputCUDA::test_input_type_dict_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:29:30.3490213Z Traceback (most recent call last): 2025-12-04T12:29:30.3490461Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:30.3490743Z getattr(self, test_name)() 2025-12-04T12:29:30.3490982Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:30.3491217Z fn() 2025-12-04T12:29:30.3491424Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3491659Z method(*args, **kwargs) 2025-12-04T12:29:30.3491886Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3492120Z method(*args, **kwargs) 2025-12-04T12:29:30.3492343Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:30.3492575Z with policy(): 2025-12-04T12:29:30.3492795Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:30.3493034Z raise RuntimeError(msg) 2025-12-04T12:29:30.3493410Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestInputCUDA.test_input_type_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 1633681408 and is now 2130706432. 
2025-12-04T12:29:30.3493748Z 2025-12-04T12:29:30.3493827Z To execute this test, run the following from the base repo dir: 2025-12-04T12:29:30.3494126Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_input.py TestInputCUDA.test_input_type_dict_cuda 2025-12-04T12:29:30.3494369Z 2025-12-04T12:29:30.3494460Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:30.3494654Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:29:30.3494841Z ======================= 1 failed, 1 deselected in 6.72s ======================== 2025-12-04T12:29:30.3494987Z Got exit code 1 2025-12-04T12:29:30.3495090Z Retrying single test... 2025-12-04T12:29:30.3495356Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_input/distributed.fsdp.test_fsdp_input-c34eff7f3645e485.xml 2025-12-04T12:29:30.3495646Z ============================= test session starts ============================== 2025-12-04T12:29:30.3495862Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:29:30.3496060Z cachedir: .pytest_cache 2025-12-04T12:29:30.3496291Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:29:30.3496537Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:29:30.3496663Z configfile: pytest.ini 2025-12-04T12:29:30.3496893Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:29:30.3497171Z collecting ... collected 2 items / 1 deselected / 1 selected 2025-12-04T12:29:30.3497479Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_input.py::TestInputCUDA::test_input_type_dict_cuda 2025-12-04T12:29:30.3497740Z Running 1 items in this shard 2025-12-04T12:29:30.3497818Z 2025-12-04T12:29:30.3498087Z distributed/fsdp/test_fsdp_input.py::TestInputCUDA::test_input_type_dict_cuda I1204 12:28:53.545000 317367 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 317436 2025-12-04T12:29:30.3498697Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T12:29:30.3499075Z _init_core_state( 2025-12-04T12:29:30.3500513Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. 
(Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:29:30.3501947Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:29:30.3502259Z [rank0]:E1204 12:28:58.730000 317436 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:29:30.3502602Z [rank0]:E1204 12:28:58.730000 317436 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:29:30.3503097Z [rank0]:E1204 12:28:58.730000 317436 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:30.3503601Z [rank0]:E1204 12:28:58.730000 317436 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:29:30.3504086Z [rank0]:E1204 12:28:58.730000 317436 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:30.3504580Z [rank0]:E1204 12:28:58.730000 317436 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:29:30.3505025Z [rank0]:E1204 12:28:58.730000 317436 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3505492Z [rank0]:E1204 12:28:58.730000 317436 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:29:30.3505962Z [rank0]:E1204 12:28:58.730000 317436 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3506434Z [rank0]:E1204 12:28:58.730000 317436 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:29:30.3506905Z [rank0]:E1204 12:28:58.730000 317436 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:30.3507358Z [rank0]:E1204 12:28:58.730000 317436 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:29:30.3507816Z [rank0]:E1204 12:28:58.730000 317436 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:30.3508286Z [rank0]:E1204 12:28:58.730000 317436 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:29:30.3508935Z [rank0]:E1204 12:28:58.730000 317436 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestInputCUDA.test_input_type_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 1633681408 and is now 2130706432. 
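[Editor's note] The AccumulateGrad stream-mismatch UserWarning printed at the start of every attempt is informational here; as the warning text itself states, it can be silenced once the mismatch is known to be intentional:

    import torch

    # Suppress the stream-mismatch warning, per the guidance quoted in the warning above.
    torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)
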
2025-12-04T12:29:30.3509518Z [rank0]:E1204 12:28:58.730000 317436 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:29:30.3509950Z [rank0]:E1204 12:28:58.730000 317436 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:29:30.3510497Z [rank0]:E1204 12:28:58.730000 317436 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_input.py TestInputCUDA.test_input_type_dict_cuda 2025-12-04T12:29:30.3510963Z [rank0]:E1204 12:28:58.730000 317436 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:29:30.3511334Z [rank0]:E1204 12:28:58.730000 317436 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:30.3511756Z [rank0]:E1204 12:28:58.730000 317436 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:29:30.3512005Z dist init r=0, world=1 2025-12-04T12:29:30.3512408Z [rank0]:[W1204 12:28:58.892768481 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:29:30.3512842Z FAILED [6.7122s] [100%] 2025-12-04T12:29:30.3512906Z 2025-12-04T12:29:30.3512967Z =================================== FAILURES =================================== 2025-12-04T12:29:30.3513149Z ___________________ TestInputCUDA.test_input_type_dict_cuda ____________________ 2025-12-04T12:29:30.3513318Z Traceback (most recent call last): 2025-12-04T12:29:30.3513587Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:29:30.3513836Z self._join_processes(fn) 2025-12-04T12:29:30.3514089Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:29:30.3514361Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:29:30.3514635Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:29:30.3514905Z raise RuntimeError(error) 2025-12-04T12:29:30.3515063Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:29:30.3515227Z Traceback (most recent call last): 2025-12-04T12:29:30.3515474Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:30.3515721Z getattr(self, test_name)() 2025-12-04T12:29:30.3515963Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:30.3516201Z fn() 2025-12-04T12:29:30.3516409Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3516646Z method(*args, **kwargs) 2025-12-04T12:29:30.3516875Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3517112Z method(*args, **kwargs) 2025-12-04T12:29:30.3517336Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:30.3517573Z with policy(): 2025-12-04T12:29:30.3517789Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:30.3518032Z raise RuntimeError(msg) 2025-12-04T12:29:30.3518442Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestInputCUDA.test_input_type_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 1633681408 and is now 2130706432. 2025-12-04T12:29:30.3518782Z 2025-12-04T12:29:30.3518862Z To execute this test, run the following from the base repo dir: 2025-12-04T12:29:30.3519164Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_input.py TestInputCUDA.test_input_type_dict_cuda 2025-12-04T12:29:30.3519386Z 2025-12-04T12:29:30.3519480Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:30.3519661Z 2025-12-04T12:29:30.3519663Z 2025-12-04T12:29:30.3519743Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:29:30.3519947Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:29:30.3520313Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_input/distributed.fsdp.test_fsdp_input-c34eff7f3645e485.xml - 2025-12-04T12:29:30.3520649Z =========================== short test summary info ============================ 2025-12-04T12:29:30.3520951Z FAILED [6.7122s] distributed/fsdp/test_fsdp_input.py::TestInputCUDA::test_input_type_dict_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:29:30.3521234Z Traceback (most recent call last): 2025-12-04T12:29:30.3521488Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:30.3521760Z getattr(self, test_name)() 2025-12-04T12:29:30.3521999Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:30.3522239Z fn() 2025-12-04T12:29:30.3522447Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3522705Z method(*args, **kwargs) 2025-12-04T12:29:30.3522934Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3523172Z method(*args, **kwargs) 2025-12-04T12:29:30.3523399Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:30.3523634Z with policy(): 2025-12-04T12:29:30.3523853Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:30.3524091Z raise RuntimeError(msg) 2025-12-04T12:29:30.3524467Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestInputCUDA.test_input_type_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 1633681408 and is now 2130706432. 
2025-12-04T12:29:30.3524812Z 2025-12-04T12:29:30.3524889Z To execute this test, run the following from the base repo dir: 2025-12-04T12:29:30.3525191Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_input.py TestInputCUDA.test_input_type_dict_cuda 2025-12-04T12:29:30.3525414Z 2025-12-04T12:29:30.3525504Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:30.3525697Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:29:30.3525869Z ======================= 1 failed, 1 deselected in 6.72s ======================== 2025-12-04T12:29:30.3526011Z Got exit code 1 2025-12-04T12:29:30.3526209Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_input.py::TestInputCUDA::test_input_type_dict_cuda 2025-12-04T12:29:30.3526514Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:29:30.3526913Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_input/distributed.fsdp.test_fsdp_input-6433fd4e4679d804.xml 2025-12-04T12:29:30.3527204Z ============================= test session starts ============================== 2025-12-04T12:29:30.3527420Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:29:30.3527616Z cachedir: .pytest_cache 2025-12-04T12:29:30.3527846Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:29:30.3528094Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:29:30.3528219Z configfile: pytest.ini 2025-12-04T12:29:30.3528456Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:29:30.3528732Z collecting ... collected 2 items / 1 deselected / 1 selected 2025-12-04T12:29:30.3528898Z stepcurrent: skipping 1 already run items. 2025-12-04T12:29:30.3529034Z Running 1 items in this shard 2025-12-04T12:29:30.3529112Z 2025-12-04T12:29:30.3529385Z distributed/fsdp/test_fsdp_input.py::TestInputCUDA::test_input_type_list_cuda I1204 12:29:02.597000 317519 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 317588 2025-12-04T12:29:30.3530023Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T12:29:30.3530419Z _init_core_state( 2025-12-04T12:29:30.3531758Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. 
If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:29:30.3533188Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:29:30.3533497Z [rank0]:E1204 12:29:07.825000 317588 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:29:30.3533842Z [rank0]:E1204 12:29:07.825000 317588 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:29:30.3534336Z [rank0]:E1204 12:29:07.825000 317588 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:30.3534819Z [rank0]:E1204 12:29:07.825000 317588 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:29:30.3535302Z [rank0]:E1204 12:29:07.825000 317588 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:30.3535758Z [rank0]:E1204 12:29:07.825000 317588 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:29:30.3536238Z [rank0]:E1204 12:29:07.825000 317588 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3536706Z [rank0]:E1204 12:29:07.825000 317588 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:29:30.3537168Z [rank0]:E1204 12:29:07.825000 317588 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3537628Z [rank0]:E1204 12:29:07.825000 317588 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:29:30.3538093Z [rank0]:E1204 12:29:07.825000 317588 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:30.3538540Z [rank0]:E1204 12:29:07.825000 317588 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:29:30.3538998Z [rank0]:E1204 12:29:07.825000 317588 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:30.3539460Z [rank0]:E1204 12:29:07.825000 317588 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:29:30.3540116Z [rank0]:E1204 12:29:07.825000 317588 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestInputCUDA.test_input_type_list_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 1633681408 and is now 2130706432. 
2025-12-04T12:29:30.3540710Z [rank0]:E1204 12:29:07.825000 317588 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:29:30.3541075Z [rank0]:E1204 12:29:07.825000 317588 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:29:30.3541618Z [rank0]:E1204 12:29:07.825000 317588 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_input.py TestInputCUDA.test_input_type_list_cuda 2025-12-04T12:29:30.3542078Z [rank0]:E1204 12:29:07.825000 317588 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:29:30.3542446Z [rank0]:E1204 12:29:07.825000 317588 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:30.3542859Z [rank0]:E1204 12:29:07.825000 317588 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:29:30.3543100Z dist init r=0, world=1 2025-12-04T12:29:30.3543500Z [rank0]:[W1204 12:29:07.976226866 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:29:30.3543908Z FAILED [6.8121s] [100%] 2025-12-04T12:29:30.3543970Z 2025-12-04T12:29:30.3544027Z =================================== FAILURES =================================== 2025-12-04T12:29:30.3544203Z ___________________ TestInputCUDA.test_input_type_list_cuda ____________________ 2025-12-04T12:29:30.3544367Z Traceback (most recent call last): 2025-12-04T12:29:30.3544609Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:29:30.3544854Z self._join_processes(fn) 2025-12-04T12:29:30.3545098Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:29:30.3545392Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:29:30.3545662Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:29:30.3545921Z raise RuntimeError(error) 2025-12-04T12:29:30.3546069Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:29:30.3546227Z Traceback (most recent call last): 2025-12-04T12:29:30.3546466Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:30.3546707Z getattr(self, test_name)() 2025-12-04T12:29:30.3546939Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:30.3547170Z fn() 2025-12-04T12:29:30.3547370Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3547599Z method(*args, **kwargs) 2025-12-04T12:29:30.3547820Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3548049Z method(*args, **kwargs) 2025-12-04T12:29:30.3548266Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:30.3548490Z with policy(): 2025-12-04T12:29:30.3548700Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:30.3548944Z raise RuntimeError(msg) 2025-12-04T12:29:30.3549313Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestInputCUDA.test_input_type_list_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 1633681408 and is now 2130706432. 2025-12-04T12:29:30.3549702Z 2025-12-04T12:29:30.3549780Z To execute this test, run the following from the base repo dir: 2025-12-04T12:29:30.3550074Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_input.py TestInputCUDA.test_input_type_list_cuda 2025-12-04T12:29:30.3550293Z 2025-12-04T12:29:30.3550382Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:30.3550506Z 2025-12-04T12:29:30.3550511Z 2025-12-04T12:29:30.3550587Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:29:30.3550786Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:29:30.3551147Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_input/distributed.fsdp.test_fsdp_input-6433fd4e4679d804.xml - 2025-12-04T12:29:30.3551476Z =========================== short test summary info ============================ 2025-12-04T12:29:30.3551778Z FAILED [6.8121s] distributed/fsdp/test_fsdp_input.py::TestInputCUDA::test_input_type_list_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:29:30.3552054Z Traceback (most recent call last): 2025-12-04T12:29:30.3552299Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:30.3552543Z getattr(self, test_name)() 2025-12-04T12:29:30.3552779Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:30.3553009Z fn() 2025-12-04T12:29:30.3553212Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3553439Z method(*args, **kwargs) 2025-12-04T12:29:30.3553657Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3553886Z method(*args, **kwargs) 2025-12-04T12:29:30.3554143Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:30.3554368Z with policy(): 2025-12-04T12:29:30.3554581Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:30.3554812Z raise RuntimeError(msg) 2025-12-04T12:29:30.3555181Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestInputCUDA.test_input_type_list_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 1633681408 and is now 2130706432. 
2025-12-04T12:29:30.3555519Z 2025-12-04T12:29:30.3555593Z To execute this test, run the following from the base repo dir: 2025-12-04T12:29:30.3555889Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_input.py TestInputCUDA.test_input_type_list_cuda 2025-12-04T12:29:30.3556110Z 2025-12-04T12:29:30.3556199Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:30.3556385Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:29:30.3556547Z ======================= 1 failed, 1 deselected in 6.82s ======================== 2025-12-04T12:29:30.3556682Z Got exit code 1 2025-12-04T12:29:30.3556777Z Retrying single test... 2025-12-04T12:29:30.3557033Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_input/distributed.fsdp.test_fsdp_input-6a4db284eb052079.xml 2025-12-04T12:29:30.3557333Z ============================= test session starts ============================== 2025-12-04T12:29:30.3557541Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:29:30.3557729Z cachedir: .pytest_cache 2025-12-04T12:29:30.3557954Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:29:30.3558209Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:29:30.3558326Z configfile: pytest.ini 2025-12-04T12:29:30.3558550Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:29:30.3558817Z collecting ... collected 2 items / 1 deselected / 1 selected 2025-12-04T12:29:30.3559102Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_input.py::TestInputCUDA::test_input_type_list_cuda 2025-12-04T12:29:30.3559356Z Running 1 items in this shard 2025-12-04T12:29:30.3559429Z 2025-12-04T12:29:30.3559737Z distributed/fsdp/test_fsdp_input.py::TestInputCUDA::test_input_type_list_cuda I1204 12:29:11.736000 317671 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 317740 2025-12-04T12:29:30.3560336Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T12:29:30.3560709Z _init_core_state( 2025-12-04T12:29:30.3562073Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. 
(Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:29:30.3563483Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:29:30.3563788Z [rank0]:E1204 12:29:17.014000 317740 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:29:30.3564126Z [rank0]:E1204 12:29:17.014000 317740 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:29:30.3564615Z [rank0]:E1204 12:29:17.014000 317740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:30.3565098Z [rank0]:E1204 12:29:17.014000 317740 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:29:30.3565578Z [rank0]:E1204 12:29:17.014000 317740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:30.3566024Z [rank0]:E1204 12:29:17.014000 317740 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:29:30.3566463Z [rank0]:E1204 12:29:17.014000 317740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3566942Z [rank0]:E1204 12:29:17.014000 317740 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:29:30.3567405Z [rank0]:E1204 12:29:17.014000 317740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3567887Z [rank0]:E1204 12:29:17.014000 317740 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:29:30.3568349Z [rank0]:E1204 12:29:17.014000 317740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:30.3568796Z [rank0]:E1204 12:29:17.014000 317740 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:29:30.3569262Z [rank0]:E1204 12:29:17.014000 317740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:30.3569770Z [rank0]:E1204 12:29:17.014000 317740 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:29:30.3570399Z [rank0]:E1204 12:29:17.014000 317740 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestInputCUDA.test_input_type_list_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 1633681408 and is now 2130706432. 
2025-12-04T12:29:30.3570976Z [rank0]:E1204 12:29:17.014000 317740 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:29:30.3571326Z [rank0]:E1204 12:29:17.014000 317740 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:29:30.3571869Z [rank0]:E1204 12:29:17.014000 317740 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_input.py TestInputCUDA.test_input_type_list_cuda 2025-12-04T12:29:30.3572366Z [rank0]:E1204 12:29:17.014000 317740 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:29:30.3572732Z [rank0]:E1204 12:29:17.014000 317740 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:30.3573142Z [rank0]:E1204 12:29:17.014000 317740 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:29:30.3573382Z dist init r=0, world=1 2025-12-04T12:29:30.3573782Z [rank0]:[W1204 12:29:17.177114576 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:29:30.3574188Z FAILED [6.8120s] [100%] 2025-12-04T12:29:30.3574251Z 2025-12-04T12:29:30.3574307Z =================================== FAILURES =================================== 2025-12-04T12:29:30.3574487Z ___________________ TestInputCUDA.test_input_type_list_cuda ____________________ 2025-12-04T12:29:30.3574649Z Traceback (most recent call last): 2025-12-04T12:29:30.3574892Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:29:30.3575133Z self._join_processes(fn) 2025-12-04T12:29:30.3575375Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:29:30.3575655Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:29:30.3575923Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:29:30.3576184Z raise RuntimeError(error) 2025-12-04T12:29:30.3576333Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:29:30.3576505Z Traceback (most recent call last): 2025-12-04T12:29:30.3576745Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:30.3576984Z getattr(self, test_name)() 2025-12-04T12:29:30.3577216Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:30.3577447Z fn() 2025-12-04T12:29:30.3577648Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3577880Z method(*args, **kwargs) 2025-12-04T12:29:30.3578100Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3578327Z method(*args, **kwargs) 2025-12-04T12:29:30.3578546Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:30.3578773Z with policy(): 2025-12-04T12:29:30.3578985Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:30.3579215Z raise RuntimeError(msg) 2025-12-04T12:29:30.3579622Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestInputCUDA.test_input_type_list_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 1633681408 and is now 2130706432. 2025-12-04T12:29:30.3579960Z 2025-12-04T12:29:30.3580035Z To execute this test, run the following from the base repo dir: 2025-12-04T12:29:30.3580330Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_input.py TestInputCUDA.test_input_type_list_cuda 2025-12-04T12:29:30.3580550Z 2025-12-04T12:29:30.3580637Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:30.3580763Z 2025-12-04T12:29:30.3580765Z 2025-12-04T12:29:30.3580879Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:29:30.3581075Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:29:30.3581441Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_input/distributed.fsdp.test_fsdp_input-6a4db284eb052079.xml - 2025-12-04T12:29:30.3581771Z =========================== short test summary info ============================ 2025-12-04T12:29:30.3582076Z FAILED [6.8120s] distributed/fsdp/test_fsdp_input.py::TestInputCUDA::test_input_type_list_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:29:30.3582353Z Traceback (most recent call last): 2025-12-04T12:29:30.3582597Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:30.3582840Z getattr(self, test_name)() 2025-12-04T12:29:30.3583077Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:30.3583308Z fn() 2025-12-04T12:29:30.3583510Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3583738Z method(*args, **kwargs) 2025-12-04T12:29:30.3583958Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3584205Z method(*args, **kwargs) 2025-12-04T12:29:30.3584422Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:30.3584648Z with policy(): 2025-12-04T12:29:30.3584858Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:30.3585113Z raise RuntimeError(msg) 2025-12-04T12:29:30.3585485Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestInputCUDA.test_input_type_list_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 1633681408 and is now 2130706432. 
2025-12-04T12:29:30.3585823Z 2025-12-04T12:29:30.3585896Z To execute this test, run the following from the base repo dir: 2025-12-04T12:29:30.3586192Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_input.py TestInputCUDA.test_input_type_list_cuda 2025-12-04T12:29:30.3586412Z 2025-12-04T12:29:30.3586499Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:30.3586685Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:29:30.3586847Z ======================= 1 failed, 1 deselected in 6.82s ======================== 2025-12-04T12:29:30.3586983Z Got exit code 1 2025-12-04T12:29:30.3587077Z Retrying single test... 2025-12-04T12:29:30.3587338Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_input/distributed.fsdp.test_fsdp_input-c38a48976e3e60de.xml 2025-12-04T12:29:30.3587621Z ============================= test session starts ============================== 2025-12-04T12:29:30.3587831Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:29:30.3588018Z cachedir: .pytest_cache 2025-12-04T12:29:30.3588241Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:29:30.3588482Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:29:30.3588601Z configfile: pytest.ini 2025-12-04T12:29:30.3588830Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:29:30.3589099Z collecting ... collected 2 items / 1 deselected / 1 selected 2025-12-04T12:29:30.3589412Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_input.py::TestInputCUDA::test_input_type_list_cuda 2025-12-04T12:29:30.3589696Z Running 1 items in this shard 2025-12-04T12:29:30.3589769Z 2025-12-04T12:29:30.3590032Z distributed/fsdp/test_fsdp_input.py::TestInputCUDA::test_input_type_list_cuda I1204 12:29:20.848000 317823 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 317892 2025-12-04T12:29:30.3590625Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T12:29:30.3590996Z _init_core_state( 2025-12-04T12:29:30.3592340Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. 
(Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:29:30.3593763Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:29:30.3594083Z [rank0]:E1204 12:29:26.078000 317892 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:29:30.3594425Z [rank0]:E1204 12:29:26.078000 317892 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:29:30.3594912Z [rank0]:E1204 12:29:26.078000 317892 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:30.3595393Z [rank0]:E1204 12:29:26.078000 317892 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:29:30.3595873Z [rank0]:E1204 12:29:26.078000 317892 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:30.3596317Z [rank0]:E1204 12:29:26.078000 317892 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:29:30.3596761Z [rank0]:E1204 12:29:26.078000 317892 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3597222Z [rank0]:E1204 12:29:26.078000 317892 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:29:30.3597684Z [rank0]:E1204 12:29:26.078000 317892 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3598144Z [rank0]:E1204 12:29:26.078000 317892 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:29:30.3598635Z [rank0]:E1204 12:29:26.078000 317892 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:30.3599086Z [rank0]:E1204 12:29:26.078000 317892 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:29:30.3599538Z [rank0]:E1204 12:29:26.078000 317892 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:30.3600043Z [rank0]:E1204 12:29:26.078000 317892 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:29:30.3600669Z [rank0]:E1204 12:29:26.078000 317892 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestInputCUDA.test_input_type_list_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 1633681408 and is now 2130706432. 
2025-12-04T12:29:30.3601256Z [rank0]:E1204 12:29:26.078000 317892 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:29:30.3601611Z [rank0]:E1204 12:29:26.078000 317892 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:29:30.3602152Z [rank0]:E1204 12:29:26.078000 317892 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_input.py TestInputCUDA.test_input_type_list_cuda 2025-12-04T12:29:30.3602637Z [rank0]:E1204 12:29:26.078000 317892 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:29:30.3603001Z [rank0]:E1204 12:29:26.078000 317892 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:30.3603430Z [rank0]:E1204 12:29:26.078000 317892 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:29:30.3603671Z dist init r=0, world=1 2025-12-04T12:29:30.3604073Z [rank0]:[W1204 12:29:26.242068674 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:29:30.3604487Z FAILED [6.7124s] [100%] 2025-12-04T12:29:30.3604549Z 2025-12-04T12:29:30.3604608Z =================================== FAILURES =================================== 2025-12-04T12:29:30.3604783Z ___________________ TestInputCUDA.test_input_type_list_cuda ____________________ 2025-12-04T12:29:30.3604946Z Traceback (most recent call last): 2025-12-04T12:29:30.3605191Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:29:30.3605433Z self._join_processes(fn) 2025-12-04T12:29:30.3605679Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:29:30.3605943Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:29:30.3606213Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:29:30.3606471Z raise RuntimeError(error) 2025-12-04T12:29:30.3606621Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:29:30.3606783Z Traceback (most recent call last): 2025-12-04T12:29:30.3607023Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:30.3607263Z getattr(self, test_name)() 2025-12-04T12:29:30.3607494Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:30.3607727Z fn() 2025-12-04T12:29:30.3607962Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3608193Z method(*args, **kwargs) 2025-12-04T12:29:30.3608416Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3608645Z method(*args, **kwargs) 2025-12-04T12:29:30.3608861Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:30.3609088Z with policy(): 2025-12-04T12:29:30.3609300Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:30.3609530Z raise RuntimeError(msg) 2025-12-04T12:29:30.3609946Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestInputCUDA.test_input_type_list_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 1633681408 and is now 2130706432. 2025-12-04T12:29:30.3610287Z 2025-12-04T12:29:30.3610360Z To execute this test, run the following from the base repo dir: 2025-12-04T12:29:30.3610653Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_input.py TestInputCUDA.test_input_type_list_cuda 2025-12-04T12:29:30.3610874Z 2025-12-04T12:29:30.3610962Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:30.3611105Z 2025-12-04T12:29:30.3611107Z 2025-12-04T12:29:30.3611182Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:29:30.3611379Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:29:30.3611744Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_input/distributed.fsdp.test_fsdp_input-c38a48976e3e60de.xml - 2025-12-04T12:29:30.3612093Z =========================== short test summary info ============================ 2025-12-04T12:29:30.3612392Z FAILED [6.7124s] distributed/fsdp/test_fsdp_input.py::TestInputCUDA::test_input_type_list_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:29:30.3612669Z Traceback (most recent call last): 2025-12-04T12:29:30.3612913Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:30.3613156Z getattr(self, test_name)() 2025-12-04T12:29:30.3613389Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:30.3613619Z fn() 2025-12-04T12:29:30.3613820Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3614049Z method(*args, **kwargs) 2025-12-04T12:29:30.3614269Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:30.3614496Z method(*args, **kwargs) 2025-12-04T12:29:30.3614713Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:30.3614937Z with policy(): 2025-12-04T12:29:30.3615150Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:30.3615385Z raise RuntimeError(msg) 2025-12-04T12:29:30.3615757Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestInputCUDA.test_input_type_list_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 1633681408 and is now 2130706432. 
2025-12-04T12:29:30.3616092Z 2025-12-04T12:29:30.3616166Z To execute this test, run the following from the base repo dir: 2025-12-04T12:29:30.3616500Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_input.py TestInputCUDA.test_input_type_list_cuda 2025-12-04T12:29:30.3616718Z 2025-12-04T12:29:30.3616807Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:30.3616992Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:29:30.3617153Z ======================= 1 failed, 1 deselected in 6.72s ======================== 2025-12-04T12:29:30.3617290Z Got exit code 1 2025-12-04T12:29:30.3617479Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_input.py::TestInputCUDA::test_input_type_list_cuda 2025-12-04T12:29:30.3617776Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:29:30.3618130Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_input/distributed.fsdp.test_fsdp_input-3088ee34454b69a4.xml 2025-12-04T12:29:30.3618416Z ============================= test session starts ============================== 2025-12-04T12:29:30.3618623Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:29:30.3618808Z cachedir: .pytest_cache 2025-12-04T12:29:30.3619028Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:29:30.3619264Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:29:30.3619380Z configfile: pytest.ini 2025-12-04T12:29:30.3619668Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:29:30.3619935Z collecting ... collected 2 items / 2 deselected / 0 selected 2025-12-04T12:29:30.3620092Z stepcurrent: skipping 2 already run items. 2025-12-04T12:29:30.3620219Z Running 0 items in this shard 2025-12-04T12:29:30.3620309Z 2025-12-04T12:29:30.3620552Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_input/distributed.fsdp.test_fsdp_input-3088ee34454b69a4.xml - 2025-12-04T12:29:30.3620879Z ============================ 2 deselected in 0.00s ============================= 2025-12-04T12:29:30.3621251Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_input.py::TestInputCUDA::test_input_type_dict_cuda', 'test/distributed/fsdp/test_fsdp_input.py::TestInputCUDA::test_input_type_list_cuda'] 2025-12-04T12:29:30.3621559Z 2025-12-04T12:29:30.3621749Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_input 1/1 (test/test-reports/distributed.fsdp.test_fsdp_input_1.1_bc379566c9ef67b0_.log) 2025-12-04T12:29:30.3621970Z 2025-12-04T12:29:30.3622096Z Finished distributed/fsdp/test_fsdp_input 1/1 ... 
[2025-12-04 12:29:30.341923][5229411.320957789], took 0.95min 2025-12-04T12:29:30.3622521Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T12:29:30.3622921Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:29:30.3623135Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T12:29:30.3623314Z Uploading artifacts took 0.00 seconds 2025-12-04T12:29:30.3623447Z distributed/fsdp/test_fsdp_input 1/1 failed! 2025-12-04T12:29:30.3623652Z Running distributed/fsdp/test_fsdp_traversal 1/1 ... [2025-12-04 12:29:30.345159][5229411.324200936] 2025-12-04T12:29:30.3623854Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:29:30.3624259Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_traversal.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:29:30.345377] 2025-12-04T12:29:55.5412851Z 2025-12-04T12:29:55.5417008Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_traversal 1/1 (test/test-reports/distributed.fsdp.test_fsdp_traversal_1.1_61a5791dc0397606_.log) 2025-12-04T12:29:55.5417967Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_traversal/distributed.fsdp.test_fsdp_traversal-0465fcbb0894d830.xml 2025-12-04T12:29:55.5418521Z ============================= test session starts ============================== 2025-12-04T12:29:55.5418929Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:29:55.5419286Z cachedir: .pytest_cache 2025-12-04T12:29:55.5419756Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:29:55.5420194Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:29:55.5420404Z configfile: pytest.ini 2025-12-04T12:29:55.5420819Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:29:55.5421270Z collecting ... 
collected 1 item 2025-12-04T12:29:55.5421516Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T12:29:55.5422006Z Running 1 items in this shard: test/distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda 2025-12-04T12:29:55.5422344Z 2025-12-04T12:29:55.5422852Z distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda I1204 12:29:32.120000 318043 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 318112 2025-12-04T12:29:55.5423766Z I1204 12:29:32.120000 318043 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 318113 2025-12-04T12:29:55.5424366Z [rank1]:E1204 12:29:35.844000 318113 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:29:55.5424977Z [rank1]:E1204 12:29:35.844000 318113 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:29:55.5425780Z [rank1]:E1204 12:29:35.844000 318113 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:55.5426523Z [rank1]:E1204 12:29:35.844000 318113 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:29:55.5427178Z [rank1]:E1204 12:29:35.844000 318113 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:55.5427829Z [rank1]:E1204 12:29:35.844000 318113 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:29:55.5436507Z [rank1]:E1204 12:29:35.844000 318113 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:55.5437038Z [rank1]:E1204 12:29:35.844000 318113 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:29:55.5437599Z [rank1]:E1204 12:29:35.844000 318113 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:55.5438107Z [rank1]:E1204 12:29:35.844000 318113 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:29:55.5438628Z [rank1]:E1204 12:29:35.844000 318113 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:55.5439117Z [rank1]:E1204 12:29:35.844000 318113 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:29:55.5439784Z [rank1]:E1204 12:29:35.844000 318113 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:55.5440293Z [rank1]:E1204 12:29:35.844000 318113 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:29:55.5440987Z [rank1]:E1204 12:29:35.844000 318113 
site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1864368128 and is now 1868562432. 2025-12-04T12:29:55.5441623Z [rank1]:E1204 12:29:35.844000 318113 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:29:55.5442015Z [rank1]:E1204 12:29:35.844000 318113 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:29:55.5442622Z [rank1]:E1204 12:29:35.844000 318113 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:29:55.5443131Z [rank1]:E1204 12:29:35.844000 318113 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:29:55.5443563Z [rank1]:E1204 12:29:35.844000 318113 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:55.5444016Z [rank1]:E1204 12:29:35.844000 318113 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:29:55.5444300Z dist init r=1, world=2 2025-12-04T12:29:55.5444535Z [rank0]:E1204 12:29:35.857000 318112 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:29:55.5444902Z [rank0]:E1204 12:29:35.857000 318112 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:29:55.5445402Z [rank0]:E1204 12:29:35.857000 318112 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:55.5445884Z [rank0]:E1204 12:29:35.857000 318112 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:29:55.5446359Z [rank0]:E1204 12:29:35.857000 318112 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:55.5446805Z [rank0]:E1204 12:29:35.857000 318112 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:29:55.5447247Z [rank0]:E1204 12:29:35.857000 318112 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:55.5447708Z [rank0]:E1204 12:29:35.857000 318112 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:29:55.5448174Z [rank0]:E1204 12:29:35.857000 318112 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:55.5448636Z [rank0]:E1204 12:29:35.857000 318112 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:29:55.5449125Z [rank0]:E1204 12:29:35.857000 318112 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:55.5449623Z [rank0]:E1204 12:29:35.857000 318112 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:29:55.5450075Z [rank0]:E1204 12:29:35.857000 318112 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:55.5450545Z [rank0]:E1204 12:29:35.857000 318112 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:29:55.5451166Z [rank0]:E1204 12:29:35.857000 318112 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2017460224 and is now 2021654528. 2025-12-04T12:29:55.5451748Z [rank0]:E1204 12:29:35.857000 318112 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:29:55.5452096Z [rank0]:E1204 12:29:35.857000 318112 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:29:55.5452646Z [rank0]:E1204 12:29:35.857000 318112 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:29:55.5453134Z [rank0]:E1204 12:29:35.857000 318112 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:29:55.5453494Z [rank0]:E1204 12:29:35.857000 318112 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:55.5453926Z [rank0]:E1204 12:29:35.857000 318112 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:29:55.5454171Z dist init r=0, world=2 2025-12-04T12:29:55.5454585Z [rank0]:[W1204 12:29:36.092753861 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:29:55.5454991Z FAILED [5.2119s] [100%] 2025-12-04T12:29:55.5455059Z 2025-12-04T12:29:55.5455118Z =================================== FAILURES =================================== 2025-12-04T12:29:55.5455300Z ___________________ TestTraversalCUDA.test_fsdp_modules_cuda ___________________ 2025-12-04T12:29:55.5455470Z Traceback (most recent call last): 2025-12-04T12:29:55.5455722Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:29:55.5455970Z self._join_processes(fn) 2025-12-04T12:29:55.5456219Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:29:55.5456485Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:29:55.5456754Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:29:55.5457014Z raise RuntimeError(error) 2025-12-04T12:29:55.5457167Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:29:55.5457331Z Traceback (most recent call last): 2025-12-04T12:29:55.5457570Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:55.5457811Z getattr(self, test_name)() 2025-12-04T12:29:55.5458080Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:55.5458316Z fn() 2025-12-04T12:29:55.5458519Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:55.5458751Z method(*args, **kwargs) 2025-12-04T12:29:55.5458976Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:55.5459207Z method(*args, **kwargs) 2025-12-04T12:29:55.5459429Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:55.5459697Z with policy(): 2025-12-04T12:29:55.5459910Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:55.5460141Z raise RuntimeError(msg) 2025-12-04T12:29:55.5460524Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2017460224 and is now 2021654528. 
2025-12-04T12:29:55.5460869Z 2025-12-04T12:29:55.5460943Z To execute this test, run the following from the base repo dir: 2025-12-04T12:29:55.5461247Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:29:55.5461478Z 2025-12-04T12:29:55.5461586Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:55.5461714Z 2025-12-04T12:29:55.5461773Z Process 1 exited with error code 10 and exception: 2025-12-04T12:29:55.5461913Z Traceback (most recent call last): 2025-12-04T12:29:55.5462155Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:55.5462414Z getattr(self, test_name)() 2025-12-04T12:29:55.5462648Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:55.5462879Z fn() 2025-12-04T12:29:55.5463079Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:55.5463308Z method(*args, **kwargs) 2025-12-04T12:29:55.5463527Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:55.5463759Z method(*args, **kwargs) 2025-12-04T12:29:55.5463977Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:55.5464202Z with policy(): 2025-12-04T12:29:55.5464413Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:55.5464648Z raise RuntimeError(msg) 2025-12-04T12:29:55.5465025Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1864368128 and is now 1868562432. 2025-12-04T12:29:55.5465364Z 2025-12-04T12:29:55.5465442Z To execute this test, run the following from the base repo dir: 2025-12-04T12:29:55.5465740Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:29:55.5465963Z 2025-12-04T12:29:55.5466053Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:55.5466178Z 2025-12-04T12:29:55.5466180Z 2025-12-04T12:29:55.5466263Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:29:55.5466467Z Process 0 terminated with exit code 10, terminating remaining processes. 
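The captured stdout above reflects the multi-process harness these distributed tests use: each rank runs the test body in its own subprocess, and the parent treats any non-zero exit status (code 10 here) as a failure of the whole test and terminates the surviving ranks. A rough, illustrative sketch of that spawn-and-join pattern using the public torch.multiprocessing API follows; the helper name _worker and the EXIT_CODE_ON_ERROR sentinel are placeholders, not the internal MultiProcessTestCase implementation.

import sys
import torch.multiprocessing as mp

EXIT_CODE_ON_ERROR = 10  # placeholder sentinel mirroring the "exit code 10" above

def _worker(rank, world_size):
    # A real rank would set up a process group and run the test body here; the
    # harness converts any exception into a sentinel exit code instead of re-raising.
    try:
        pass  # test body
    except Exception:
        sys.exit(EXIT_CODE_ON_ERROR)

if __name__ == "__main__":
    try:
        # One subprocess per rank; spawn() joins them and raises if any rank
        # exits with a non-zero status, which the runner then reports as
        # "Process N exited with error code 10".
        mp.spawn(_worker, args=(2,), nprocs=2)
    except Exception as exc:
        print(f"rank failure: {exc}")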
2025-12-04T12:29:55.5466879Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_traversal/distributed.fsdp.test_fsdp_traversal-0465fcbb0894d830.xml - 2025-12-04T12:29:55.5467225Z =========================== short test summary info ============================ 2025-12-04T12:29:55.5467539Z FAILED [5.2119s] distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:29:55.5467828Z Traceback (most recent call last): 2025-12-04T12:29:55.5468074Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:55.5468319Z getattr(self, test_name)() 2025-12-04T12:29:55.5468551Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:55.5468785Z fn() 2025-12-04T12:29:55.5468991Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:55.5469222Z method(*args, **kwargs) 2025-12-04T12:29:55.5469441Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:55.5469703Z method(*args, **kwargs) 2025-12-04T12:29:55.5469925Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:55.5470176Z with policy(): 2025-12-04T12:29:55.5470387Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:55.5470618Z raise RuntimeError(msg) 2025-12-04T12:29:55.5470996Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2017460224 and is now 2021654528. 
2025-12-04T12:29:55.5471350Z 2025-12-04T12:29:55.5471424Z To execute this test, run the following from the base repo dir: 2025-12-04T12:29:55.5471723Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:29:55.5471948Z 2025-12-04T12:29:55.5472033Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:55.5472156Z 2025-12-04T12:29:55.5472217Z Process 1 exited with error code 10 and exception: 2025-12-04T12:29:55.5472354Z Traceback (most recent call last): 2025-12-04T12:29:55.5472593Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:55.5472831Z getattr(self, test_name)() 2025-12-04T12:29:55.5473058Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:55.5473291Z fn() 2025-12-04T12:29:55.5473487Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:55.5473711Z method(*args, **kwargs) 2025-12-04T12:29:55.5473926Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:55.5474153Z method(*args, **kwargs) 2025-12-04T12:29:55.5474368Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:55.5474592Z with policy(): 2025-12-04T12:29:55.5474798Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:55.5475027Z raise RuntimeError(msg) 2025-12-04T12:29:55.5475437Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1864368128 and is now 1868562432. 2025-12-04T12:29:55.5475776Z 2025-12-04T12:29:55.5475850Z To execute this test, run the following from the base repo dir: 2025-12-04T12:29:55.5476145Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:29:55.5476366Z 2025-12-04T12:29:55.5476456Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:55.5476642Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:29:55.5476801Z ============================== 1 failed in 5.37s =============================== 2025-12-04T12:29:55.5476931Z Got exit code 1 2025-12-04T12:29:55.5477026Z Retrying single test... 
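The leak itself is reported by the PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 mode named in the repro command: the test framework records caching-allocator and driver memory usage before the test and compares it again afterwards, and any growth (512 -> 2560 on the caching allocator here, plus 4 MiB at the driver level) is raised as a RuntimeError. On this ROCm runner the torch.cuda namespace is backed by HIP, which is why the message still says "CUDA driver API" on a gfx942 GPU. Below is a simplified, stand-alone sketch of that before/after comparison using only public torch.cuda calls; it omits the driver-level query and the re-check passes the real checker performs, so treat it as an illustration rather than the actual mechanism.

import torch

def check_allocator_growth(fn, device=0):
    # Run fn() and flag any growth in caching-allocator usage on `device`.
    # Simplified illustration only; the CI checker also consults driver-reported
    # memory and confirms the growth before declaring a leak.
    if not torch.cuda.is_available():
        return fn()
    torch.cuda.synchronize(device)
    before = torch.cuda.memory_allocated(device)
    result = fn()
    torch.cuda.synchronize(device)
    after = torch.cuda.memory_allocated(device)
    if after > before:
        raise RuntimeError(
            f"caching allocator grew from {before} to {after} on device {device}"
        )
    return result

# usage sketch (run_single_test is hypothetical): check_allocator_growth(lambda: run_single_test())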
2025-12-04T12:29:55.5477299Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_traversal/distributed.fsdp.test_fsdp_traversal-a7a57420ca89d548.xml 2025-12-04T12:29:55.5477595Z ============================= test session starts ============================== 2025-12-04T12:29:55.5477805Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:29:55.5477992Z cachedir: .pytest_cache 2025-12-04T12:29:55.5478212Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:29:55.5478463Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:29:55.5478580Z configfile: pytest.ini 2025-12-04T12:29:55.5478805Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:29:55.5479045Z collecting ... collected 1 item 2025-12-04T12:29:55.5479302Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda 2025-12-04T12:29:55.5479615Z Running 1 items in this shard 2025-12-04T12:29:55.5479687Z 2025-12-04T12:29:55.5479962Z distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda I1204 12:29:39.658000 318271 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 318340 2025-12-04T12:29:55.5480423Z I1204 12:29:39.659000 318271 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 318341 2025-12-04T12:29:55.5480757Z [rank0]:E1204 12:29:43.421000 318340 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:29:55.5481097Z [rank0]:E1204 12:29:43.421000 318340 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:29:55.5481587Z [rank0]:E1204 12:29:43.421000 318340 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:55.5482065Z [rank0]:E1204 12:29:43.421000 318340 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:29:55.5482539Z [rank0]:E1204 12:29:43.421000 318340 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:55.5482980Z [rank0]:E1204 12:29:43.421000 318340 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:29:55.5483416Z [rank0]:E1204 12:29:43.421000 318340 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:55.5483881Z [rank0]:E1204 12:29:43.421000 318340 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:29:55.5484374Z [rank0]:E1204 12:29:43.421000 318340 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:55.5484832Z [rank0]:E1204 12:29:43.421000 318340 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:29:55.5485290Z [rank0]:E1204 12:29:43.421000 318340 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:55.5485742Z [rank0]:E1204 12:29:43.421000 318340 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:29:55.5486197Z [rank0]:E1204 12:29:43.421000 318340 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:55.5486659Z [rank0]:E1204 12:29:43.421000 318340 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:29:55.5487277Z [rank0]:E1204 12:29:43.421000 318340 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2017460224 and is now 2021654528. 2025-12-04T12:29:55.5487872Z [rank0]:E1204 12:29:43.421000 318340 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:29:55.5488217Z [rank0]:E1204 12:29:43.421000 318340 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:29:55.5488782Z [rank0]:E1204 12:29:43.421000 318340 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:29:55.5489243Z [rank0]:E1204 12:29:43.421000 318340 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:29:55.5489641Z [rank0]:E1204 12:29:43.421000 318340 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:55.5490051Z [rank0]:E1204 12:29:43.421000 318340 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:29:55.5490288Z dist init r=0, world=2 2025-12-04T12:29:55.5490489Z [rank1]:E1204 12:29:43.427000 318341 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:29:55.5490825Z [rank1]:E1204 12:29:43.427000 318341 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:29:55.5491311Z [rank1]:E1204 12:29:43.427000 318341 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:55.5491784Z [rank1]:E1204 12:29:43.427000 318341 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:29:55.5492261Z [rank1]:E1204 12:29:43.427000 318341 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:55.5492708Z [rank1]:E1204 12:29:43.427000 318341 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:29:55.5493176Z [rank1]:E1204 12:29:43.427000 318341 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:55.5493636Z [rank1]:E1204 12:29:43.427000 318341 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:29:55.5494094Z [rank1]:E1204 12:29:43.427000 318341 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:55.5494554Z [rank1]:E1204 12:29:43.427000 318341 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:29:55.5495012Z [rank1]:E1204 12:29:43.427000 318341 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:55.5495464Z [rank1]:E1204 12:29:43.427000 318341 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:29:55.5495916Z [rank1]:E1204 12:29:43.427000 318341 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:55.5496376Z [rank1]:E1204 12:29:43.427000 318341 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:29:55.5496989Z [rank1]:E1204 12:29:43.427000 318341 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1864368128 and is now 1868562432. 2025-12-04T12:29:55.5497583Z [rank1]:E1204 12:29:43.427000 318341 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:29:55.5497947Z [rank1]:E1204 12:29:43.427000 318341 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:29:55.5498490Z [rank1]:E1204 12:29:43.427000 318341 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:29:55.5498954Z [rank1]:E1204 12:29:43.427000 318341 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:29:55.5499316Z [rank1]:E1204 12:29:43.427000 318341 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:55.5499773Z [rank1]:E1204 12:29:43.427000 318341 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:29:55.5500013Z dist init r=1, world=2 2025-12-04T12:29:55.5500409Z [rank0]:[W1204 12:29:43.578553682 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:29:55.5500814Z FAILED [5.1111s] [100%] 2025-12-04T12:29:55.5500876Z 2025-12-04T12:29:55.5500934Z =================================== FAILURES =================================== 2025-12-04T12:29:55.5501112Z ___________________ TestTraversalCUDA.test_fsdp_modules_cuda ___________________ 2025-12-04T12:29:55.5501275Z Traceback (most recent call last): 2025-12-04T12:29:55.5501517Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:29:55.5501759Z self._join_processes(fn) 2025-12-04T12:29:55.5502003Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:29:55.5502324Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:29:55.5502589Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:29:55.5502848Z raise RuntimeError(error) 2025-12-04T12:29:55.5502995Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:29:55.5503155Z Traceback (most recent call last): 2025-12-04T12:29:55.5503393Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:55.5503633Z getattr(self, test_name)() 2025-12-04T12:29:55.5503863Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:55.5504090Z fn() 2025-12-04T12:29:55.5504291Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:55.5504522Z method(*args, **kwargs) 2025-12-04T12:29:55.5504740Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:55.5504966Z method(*args, **kwargs) 2025-12-04T12:29:55.5505183Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:55.5505407Z with policy(): 2025-12-04T12:29:55.5505635Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:55.5505863Z raise RuntimeError(msg) 2025-12-04T12:29:55.5506236Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2017460224 and is now 2021654528. 2025-12-04T12:29:55.5506594Z 2025-12-04T12:29:55.5506671Z To execute this test, run the following from the base repo dir: 2025-12-04T12:29:55.5506972Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:29:55.5507196Z 2025-12-04T12:29:55.5507283Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:55.5507407Z 2025-12-04T12:29:55.5507409Z 2025-12-04T12:29:55.5507484Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:29:55.5507684Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:29:55.5508061Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_traversal/distributed.fsdp.test_fsdp_traversal-a7a57420ca89d548.xml - 2025-12-04T12:29:55.5508401Z =========================== short test summary info ============================ 2025-12-04T12:29:55.5508710Z FAILED [5.1111s] distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:29:55.5508996Z Traceback (most recent call last): 2025-12-04T12:29:55.5509236Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:55.5509478Z getattr(self, test_name)() 2025-12-04T12:29:55.5509757Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:55.5509989Z fn() 2025-12-04T12:29:55.5510188Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:55.5510417Z method(*args, **kwargs) 2025-12-04T12:29:55.5510635Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:55.5510900Z method(*args, **kwargs) 2025-12-04T12:29:55.5511118Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:55.5511340Z with policy(): 2025-12-04T12:29:55.5511549Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:55.5511776Z raise RuntimeError(msg) 2025-12-04T12:29:55.5512149Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2017460224 and is now 2021654528. 2025-12-04T12:29:55.5512491Z 2025-12-04T12:29:55.5512564Z To execute this test, run the following from the base repo dir: 2025-12-04T12:29:55.5512866Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:29:55.5513092Z 2025-12-04T12:29:55.5513178Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:55.5513367Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:29:55.5513524Z ============================== 1 failed in 5.27s =============================== 2025-12-04T12:29:55.5513653Z Got exit code 1 2025-12-04T12:29:55.5513748Z Retrying single test... 
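Every attempt also ends with the ProcessGroupNCCL warning that destroy_process_group() was not called before program exit. That warning is separate from the leak check: the spawned ranks exit via the sentinel code without tearing down the process group. The documented remedy is an explicit teardown, shown in this minimal, self-contained sketch (single process, gloo backend, illustrative address/port values), not a reproduction of the test harness itself.

import os
import torch.distributed as dist

def main():
    # Minimal single-process group, only to show the init/teardown pairing.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group(backend="gloo", rank=0, world_size=1)
    try:
        pass  # collective work / test body would go here
    finally:
        # Explicit teardown avoids the "destroy_process_group() was not called
        # before program exit" warning seen in the log above.
        dist.destroy_process_group()

if __name__ == "__main__":
    main()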
2025-12-04T12:29:55.5514015Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_traversal/distributed.fsdp.test_fsdp_traversal-fcbdb9cd508712d5.xml 2025-12-04T12:29:55.5514328Z ============================= test session starts ============================== 2025-12-04T12:29:55.5514537Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:29:55.5514741Z cachedir: .pytest_cache 2025-12-04T12:29:55.5514965Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:29:55.5515203Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:29:55.5515320Z configfile: pytest.ini 2025-12-04T12:29:55.5515544Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:29:55.5515787Z collecting ... collected 1 item 2025-12-04T12:29:55.5516046Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda 2025-12-04T12:29:55.5516307Z Running 1 items in this shard 2025-12-04T12:29:55.5516380Z 2025-12-04T12:29:55.5516652Z distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda I1204 12:29:47.249000 318499 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 318568 2025-12-04T12:29:55.5517114Z I1204 12:29:47.249000 318499 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 318569 2025-12-04T12:29:55.5517442Z [rank1]:E1204 12:29:51.027000 318569 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:29:55.5517779Z [rank1]:E1204 12:29:51.027000 318569 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:29:55.5518263Z [rank1]:E1204 12:29:51.027000 318569 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:55.5518737Z [rank1]:E1204 12:29:51.027000 318569 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:29:55.5519240Z [rank1]:E1204 12:29:51.027000 318569 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:55.5519724Z [rank1]:E1204 12:29:51.027000 318569 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:29:55.5520160Z [rank1]:E1204 12:29:51.027000 318569 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:55.5520620Z [rank1]:E1204 12:29:51.027000 318569 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:29:55.5521084Z [rank1]:E1204 12:29:51.027000 318569 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:55.5521545Z [rank1]:E1204 12:29:51.027000 318569 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:29:55.5522011Z [rank1]:E1204 12:29:51.027000 318569 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:55.5522465Z [rank1]:E1204 12:29:51.027000 318569 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:29:55.5522917Z [rank1]:E1204 12:29:51.027000 318569 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:55.5523398Z [rank1]:E1204 12:29:51.027000 318569 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:29:55.5524015Z [rank1]:E1204 12:29:51.027000 318569 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1864368128 and is now 1868562432. 2025-12-04T12:29:55.5524608Z [rank1]:E1204 12:29:51.027000 318569 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:29:55.5524952Z [rank1]:E1204 12:29:51.027000 318569 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:29:55.5525495Z [rank1]:E1204 12:29:51.027000 318569 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:29:55.5525957Z [rank1]:E1204 12:29:51.027000 318569 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:29:55.5526321Z [rank1]:E1204 12:29:51.027000 318569 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:55.5526731Z [rank1]:E1204 12:29:51.027000 318569 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:29:55.5526968Z dist init r=1, world=2 2025-12-04T12:29:55.5527167Z [rank0]:E1204 12:29:51.054000 318568 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:29:55.5527500Z [rank0]:E1204 12:29:51.054000 318568 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:29:55.5527981Z [rank0]:E1204 12:29:51.054000 318568 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:55.5528489Z [rank0]:E1204 12:29:51.054000 318568 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:29:55.5528965Z [rank0]:E1204 12:29:51.054000 318568 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:55.5529407Z [rank0]:E1204 12:29:51.054000 318568 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:29:55.5529879Z [rank0]:E1204 12:29:51.054000 318568 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:55.5530342Z [rank0]:E1204 12:29:51.054000 318568 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:29:55.5530808Z [rank0]:E1204 12:29:51.054000 318568 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:55.5531270Z [rank0]:E1204 12:29:51.054000 318568 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:29:55.5531729Z [rank0]:E1204 12:29:51.054000 318568 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:55.5532191Z [rank0]:E1204 12:29:51.054000 318568 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:29:55.5532639Z [rank0]:E1204 12:29:51.054000 318568 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:55.5533164Z [rank0]:E1204 12:29:51.054000 318568 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:29:55.5533779Z [rank0]:E1204 12:29:51.054000 318568 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2017460224 and is now 2021654528. 2025-12-04T12:29:55.5534356Z [rank0]:E1204 12:29:51.054000 318568 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:29:55.5534701Z [rank0]:E1204 12:29:51.054000 318568 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:29:55.5535248Z [rank0]:E1204 12:29:51.054000 318568 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:29:55.5535717Z [rank0]:E1204 12:29:51.054000 318568 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:29:55.5536080Z [rank0]:E1204 12:29:51.054000 318568 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:55.5536490Z [rank0]:E1204 12:29:51.054000 318568 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:29:55.5536728Z dist init r=0, world=2 2025-12-04T12:29:55.5537260Z [rank0]:[W1204 12:29:51.244603609 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:29:55.5537664Z FAILED [5.2120s] [100%] 2025-12-04T12:29:55.5537726Z 2025-12-04T12:29:55.5537826Z =================================== FAILURES =================================== 2025-12-04T12:29:55.5538004Z ___________________ TestTraversalCUDA.test_fsdp_modules_cuda ___________________ 2025-12-04T12:29:55.5538169Z Traceback (most recent call last): 2025-12-04T12:29:55.5538412Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:29:55.5538651Z self._join_processes(fn) 2025-12-04T12:29:55.5538892Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:29:55.5539150Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:29:55.5539413Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:29:55.5539708Z raise RuntimeError(error) 2025-12-04T12:29:55.5539857Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:29:55.5540018Z Traceback (most recent call last): 2025-12-04T12:29:55.5540253Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:55.5540491Z getattr(self, test_name)() 2025-12-04T12:29:55.5540720Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:55.5540949Z fn() 2025-12-04T12:29:55.5541149Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:55.5541392Z method(*args, **kwargs) 2025-12-04T12:29:55.5541609Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:55.5541835Z method(*args, **kwargs) 2025-12-04T12:29:55.5542065Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:55.5542290Z with policy(): 2025-12-04T12:29:55.5542497Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:55.5542721Z raise RuntimeError(msg) 2025-12-04T12:29:55.5543089Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1864368128 and is now 1868562432. 2025-12-04T12:29:55.5543430Z 2025-12-04T12:29:55.5543503Z To execute this test, run the following from the base repo dir: 2025-12-04T12:29:55.5543801Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:29:55.5544023Z 2025-12-04T12:29:55.5544111Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:55.5544232Z 2025-12-04T12:29:55.5544235Z 2025-12-04T12:29:55.5544311Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:29:55.5544506Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:29:55.5544884Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_traversal/distributed.fsdp.test_fsdp_traversal-fcbdb9cd508712d5.xml - 2025-12-04T12:29:55.5545228Z =========================== short test summary info ============================ 2025-12-04T12:29:55.5545536Z FAILED [5.2120s] distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:29:55.5545821Z Traceback (most recent call last): 2025-12-04T12:29:55.5546067Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:29:55.5546312Z getattr(self, test_name)() 2025-12-04T12:29:55.5546576Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:29:55.5546810Z fn() 2025-12-04T12:29:55.5547009Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:55.5547236Z method(*args, **kwargs) 2025-12-04T12:29:55.5547454Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:29:55.5547682Z method(*args, **kwargs) 2025-12-04T12:29:55.5547898Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:29:55.5548121Z with policy(): 2025-12-04T12:29:55.5548332Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:29:55.5548565Z raise RuntimeError(msg) 2025-12-04T12:29:55.5548937Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1864368128 and is now 1868562432. 2025-12-04T12:29:55.5549279Z 2025-12-04T12:29:55.5549353Z To execute this test, run the following from the base repo dir: 2025-12-04T12:29:55.5549691Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:29:55.5549938Z 2025-12-04T12:29:55.5550024Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:29:55.5550210Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T12:29:55.5550366Z ============================== 1 failed in 5.35s ===============================
2025-12-04T12:29:55.5550511Z Got exit code 1
2025-12-04T12:29:55.5550710Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda
2025-12-04T12:29:55.5551013Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set
2025-12-04T12:29:55.5551378Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_traversal/distributed.fsdp.test_fsdp_traversal-6360f1f260e2dc0b.xml
2025-12-04T12:29:55.5551674Z ============================= test session starts ==============================
2025-12-04T12:29:55.5551885Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T12:29:55.5552074Z cachedir: .pytest_cache
2025-12-04T12:29:55.5552300Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T12:29:55.5552538Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T12:29:55.5552656Z configfile: pytest.ini
2025-12-04T12:29:55.5552884Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T12:29:55.5553150Z collecting ... collected 1 item / 1 deselected / 0 selected
2025-12-04T12:29:55.5553307Z stepcurrent: skipping 1 already run items.
2025-12-04T12:29:55.5553433Z Running 0 items in this shard
2025-12-04T12:29:55.5553508Z
2025-12-04T12:29:55.5553755Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_traversal/distributed.fsdp.test_fsdp_traversal-6360f1f260e2dc0b.xml -
2025-12-04T12:29:55.5554097Z ============================ 1 deselected in 0.00s =============================
2025-12-04T12:29:55.5554360Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda']
2025-12-04T12:29:55.5554563Z
2025-12-04T12:29:55.5554791Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_traversal 1/1 (test/test-reports/distributed.fsdp.test_fsdp_traversal_1.1_61a5791dc0397606_.log)
2025-12-04T12:29:55.5555021Z
2025-12-04T12:29:55.5555153Z Finished distributed/fsdp/test_fsdp_traversal 1/1 ... [2025-12-04 12:29:55.541341][5229436.520378029], took 0.42min
2025-12-04T12:29:55.5555594Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml
2025-12-04T12:29:55.5555980Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T12:29:55.5556196Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading
2025-12-04T12:29:55.5556377Z Uploading artifacts took 0.00 seconds
2025-12-04T12:29:55.5556517Z distributed/fsdp/test_fsdp_traversal 1/1 failed!
2025-12-04T12:29:55.5556731Z Running distributed/fsdp/test_fsdp_ignored_modules 1/1 ... [2025-12-04 12:29:55.544353][5229436.523394849]
2025-12-04T12:29:55.5556937Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set
2025-12-04T12:29:55.5557360Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_ignored_modules.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ...
[2025-12-04 12:29:55.544542] 2025-12-04T12:30:48.0842556Z 2025-12-04T12:30:48.0843173Z distributed/fsdp/test_fsdp_ignored_modules 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_ignored_modules_1.1_82e52ac70f0a3012_.log 2025-12-04T12:30:48.0845366Z Running 8 items in this shard: test/distributed/fsdp/test_fsdp_ignored_modules.py::TestFSDPIgnoredModules::test_diff_ignored_modules_across_ranks, test/distributed/fsdp/test_fsdp_ignored_modules.py::TestFSDPIgnoredModules::test_ignored_modules_invalid, test/distributed/fsdp/test_fsdp_ignored_modules.py::TestFSDPIgnoredModules::test_ignored_modules_nested, test/distributed/fsdp/test_fsdp_ignored_modules.py::TestFSDPIgnoredModules::test_ignored_modules_not_under_wrapped_root_ignore_modules_False, test/distributed/fsdp/test_fsdp_ignored_modules.py::TestFSDPIgnoredModules::test_ignored_modules_not_under_wrapped_root_ignore_modules_True, test/distributed/fsdp/test_fsdp_ignored_modules.py::TestFSDPIgnoredModules::test_ignored_modules_transformer, test/distributed/fsdp/test_fsdp_ignored_modules.py::TestFSDPIgnoredModules::test_ignored_states_auto_wrap, test/distributed/fsdp/test_fsdp_ignored_modules.py::TestFSDPIgnoredModules::test_ignored_states_check 2025-12-04T12:30:48.0846777Z 2025-12-04T12:30:48.0846931Z Finished distributed/fsdp/test_fsdp_ignored_modules 1/1 ... [2025-12-04 12:30:48.083941][5229489.062979017], took 0.88min 2025-12-04T12:30:48.0861304Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T12:30:48.0872616Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:30:48.0874880Z Running distributed/fsdp/test_checkpoint_wrapper 1/1 ... [2025-12-04 12:30:48.087390][5229489.066431092] 2025-12-04T12:30:48.0875100Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:30:48.0876838Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_checkpoint_wrapper.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:30:48.087586] 2025-12-04T12:30:53.2107301Z 2025-12-04T12:30:53.2107937Z distributed/fsdp/test_checkpoint_wrapper 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_checkpoint_wrapper_1.1_3876383eab787904_.log 2025-12-04T12:30:53.2110133Z Running 8 items in this shard: test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_apply_activation_checkpointing, test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_checkpoint_wrapper_args_kwargs, test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_checkpoint_wrapper_cpu_offload, test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_checkpoint_wrapper_kwarg_support, test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_checkpoint_wrapper_parity, test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_forward_missing_attributes, test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_fqn, test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_load_activation_checkpointed_module 2025-12-04T12:30:53.2111419Z 2025-12-04T12:30:53.2111574Z Finished distributed/fsdp/test_checkpoint_wrapper 1/1 ... [2025-12-04 12:30:53.210438][5229494.189476779], took 0.09min 2025-12-04T12:30:53.2124918Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T12:30:53.2135308Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:30:53.2137541Z Running distributed/fsdp/test_fsdp_checkpoint 1/1 ... [2025-12-04 12:30:53.213678][5229494.192719747] 2025-12-04T12:30:53.2137759Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:30:53.2139732Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_checkpoint.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:30:53.213871] 2025-12-04T12:33:42.6898536Z 2025-12-04T12:33:42.6899739Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_checkpoint 1/1 (test/test-reports/distributed.fsdp.test_fsdp_checkpoint_1.1_c244fb9a4f737098_.log) 2025-12-04T12:33:42.6901462Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_checkpoint/distributed.fsdp.test_fsdp_checkpoint-ef762e98490b0b73.xml 2025-12-04T12:33:42.6902210Z ============================= test session starts ============================== 2025-12-04T12:33:42.6902714Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:33:42.6903161Z cachedir: .pytest_cache 2025-12-04T12:33:42.6903648Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:42.6904159Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:42.6904401Z configfile: pytest.ini 2025-12-04T12:33:42.6904868Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:42.6905959Z collecting ... 
/var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:292: PytestCollectionWarning: cannot collect test class 'TestModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_checkpoint.py) 2025-12-04T12:33:42.6906815Z class TestModel(nn.Module): 2025-12-04T12:33:42.6907040Z collected 17 items 2025-12-04T12:33:42.6907279Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T12:33:42.6914441Z Running 17 items in this shard: test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload0_offload_activations_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload0_offload_activations_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload0_offload_activations_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload0_offload_activations_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload1_offload_activations_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload1_offload_activations_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload1_offload_activations_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload1_offload_activations_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload0_offload_activations_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload0_offload_activations_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload0_offload_activations_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload0_offload_activations_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload1_offload_activations_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload1_offload_activations_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload1_offload_activations_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload1_offload_activations_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:33:42.6919814Z 2025-12-04T12:33:42.6920426Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload0_offload_activations_False_use_orig_params_False I1204 12:30:54.991000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 320442 2025-12-04T12:33:42.6921212Z I1204 12:30:54.992000 
320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 320443 2025-12-04T12:33:42.6921703Z I1204 12:30:54.992000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 320444 2025-12-04T12:33:42.6922184Z I1204 12:30:54.993000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 320445 2025-12-04T12:33:42.6922875Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:33:42.6923367Z return func(*args, **kwargs) 2025-12-04T12:33:42.6923494Z dist init r=0, world=4 2025-12-04T12:33:42.6923605Z dist init r=3, world=4 2025-12-04T12:33:42.6923712Z dist init r=1, world=4 2025-12-04T12:33:42.6923817Z dist init r=2, world=4 2025-12-04T12:33:42.6923924Z PASSED [8.7186s] [ 5%] 2025-12-04T12:33:42.6924394Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload0_offload_activations_False_use_orig_params_True I1204 12:31:03.713000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 320775 2025-12-04T12:33:42.6924999Z I1204 12:31:03.714000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 320776 2025-12-04T12:33:42.6925375Z I1204 12:31:03.714000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 320777 2025-12-04T12:33:42.6925807Z I1204 12:31:03.715000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 320778 2025-12-04T12:33:42.6926331Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:33:42.6926736Z return func(*args, **kwargs) 2025-12-04T12:33:42.6926857Z dist init r=0, world=4 2025-12-04T12:33:42.6926964Z dist init r=3, world=4 2025-12-04T12:33:42.6927074Z dist init r=1, world=4 2025-12-04T12:33:42.6927180Z dist init r=2, world=4 2025-12-04T12:33:42.6927286Z PASSED [8.1161s] [ 11%] 2025-12-04T12:33:42.6927752Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload0_offload_activations_True_use_orig_params_False I1204 12:31:11.830000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 321108 2025-12-04T12:33:42.6928361Z I1204 12:31:11.831000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 321109 2025-12-04T12:33:42.6928734Z I1204 12:31:11.831000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 321110 2025-12-04T12:33:42.6929106Z I1204 12:31:11.832000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 321111 2025-12-04T12:33:42.6929668Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
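The barrier() UserWarning that repeats throughout this run states that it can be muted by passing `device_id` to `init_process_group`. A minimal, hypothetical sketch (not part of this test suite; the rank, world size, and rendezvous variables are assumed to be provided by the launcher):

    import os
    import torch
    import torch.distributed as dist

    # Assumed: RANK, WORLD_SIZE, MASTER_ADDR, MASTER_PORT are set by the launcher.
    rank = int(os.environ["RANK"])
    world_size = int(os.environ["WORLD_SIZE"])
    torch.cuda.set_device(rank)  # bind this process to its GPU
    dist.init_process_group(
        backend="nccl",  # RCCL on ROCm builds
        rank=rank,
        world_size=world_size,
        device_id=torch.device(f"cuda:{rank}"),  # explicit device mutes the barrier() warning
    )
    dist.barrier()  # no "using the device under current context" warning expected here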
2025-12-04T12:33:42.6930103Z return func(*args, **kwargs) 2025-12-04T12:33:42.6930223Z dist init r=0, world=4 2025-12-04T12:33:42.6930330Z dist init r=3, world=4 2025-12-04T12:33:42.6930437Z dist init r=1, world=4 2025-12-04T12:33:42.6930542Z dist init r=2, world=4 2025-12-04T12:33:42.6930668Z PASSED [8.3163s] [ 17%] 2025-12-04T12:33:42.6931131Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload0_offload_activations_True_use_orig_params_True I1204 12:31:20.148000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 321441 2025-12-04T12:33:42.6931729Z I1204 12:31:20.149000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 321442 2025-12-04T12:33:42.6932103Z I1204 12:31:20.150000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 321443 2025-12-04T12:33:42.6932475Z I1204 12:31:20.150000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 321444 2025-12-04T12:33:42.6932997Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:33:42.6933385Z return func(*args, **kwargs) 2025-12-04T12:33:42.6933496Z dist init r=0, world=4 2025-12-04T12:33:42.6933594Z dist init r=3, world=4 2025-12-04T12:33:42.6933690Z dist init r=2, world=4 2025-12-04T12:33:42.6933785Z dist init r=1, world=4 2025-12-04T12:33:42.6933881Z PASSED [8.3165s] [ 23%] 2025-12-04T12:33:42.6934301Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload1_offload_activations_False_use_orig_params_False I1204 12:31:28.466000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 321774 2025-12-04T12:33:42.6934849Z I1204 12:31:28.467000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 321775 2025-12-04T12:33:42.6935186Z I1204 12:31:28.468000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 321776 2025-12-04T12:33:42.6935567Z I1204 12:31:28.468000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 321777 2025-12-04T12:33:42.6936040Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T12:33:42.6936402Z return func(*args, **kwargs) 2025-12-04T12:33:42.6936510Z dist init r=0, world=4 2025-12-04T12:33:42.6936607Z dist init r=3, world=4 2025-12-04T12:33:42.6936703Z dist init r=2, world=4 2025-12-04T12:33:42.6936801Z dist init r=1, world=4 2025-12-04T12:33:42.6936898Z PASSED [8.4160s] [ 29%] 2025-12-04T12:33:42.6937316Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload1_offload_activations_False_use_orig_params_True I1204 12:31:36.884000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 322107 2025-12-04T12:33:42.6937867Z I1204 12:31:36.885000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 322108 2025-12-04T12:33:42.6938205Z I1204 12:31:36.885000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 322109 2025-12-04T12:33:42.6938541Z I1204 12:31:36.885000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 322110 2025-12-04T12:33:42.6939014Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:33:42.6939392Z return func(*args, **kwargs) 2025-12-04T12:33:42.6939501Z dist init r=0, world=4 2025-12-04T12:33:42.6939651Z dist init r=3, world=4 2025-12-04T12:33:42.6939747Z dist init r=2, world=4 2025-12-04T12:33:42.6939846Z dist init r=1, world=4 2025-12-04T12:33:42.6939960Z PASSED [8.2159s] [ 35%] 2025-12-04T12:33:42.6940382Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload1_offload_activations_True_use_orig_params_False I1204 12:31:45.102000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 322440 2025-12-04T12:33:42.6940924Z I1204 12:31:45.102000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 322441 2025-12-04T12:33:42.6941262Z I1204 12:31:45.102000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 322442 2025-12-04T12:33:42.6941602Z I1204 12:31:45.103000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 322443 2025-12-04T12:33:42.6942074Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T12:33:42.6942441Z return func(*args, **kwargs) 2025-12-04T12:33:42.6942549Z dist init r=0, world=4 2025-12-04T12:33:42.6942647Z dist init r=3, world=4 2025-12-04T12:33:42.6942745Z dist init r=1, world=4 2025-12-04T12:33:42.6942841Z dist init r=2, world=4 2025-12-04T12:33:42.6942938Z PASSED [8.1161s] [ 41%] 2025-12-04T12:33:42.6943353Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload1_offload_activations_True_use_orig_params_True I1204 12:31:53.219000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 322773 2025-12-04T12:33:42.6943896Z I1204 12:31:53.220000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 322774 2025-12-04T12:33:42.6944232Z I1204 12:31:53.220000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 322775 2025-12-04T12:33:42.6944568Z I1204 12:31:53.220000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 322776 2025-12-04T12:33:42.6945089Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:33:42.6945454Z return func(*args, **kwargs) 2025-12-04T12:33:42.6945559Z dist init r=0, world=4 2025-12-04T12:33:42.6945659Z dist init r=3, world=4 2025-12-04T12:33:42.6945756Z dist init r=1, world=4 2025-12-04T12:33:42.6945853Z dist init r=2, world=4 2025-12-04T12:33:42.6945950Z PASSED [8.4152s] [ 47%] 2025-12-04T12:33:42.6946369Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload0_offload_activations_False_use_orig_params_False I1204 12:32:01.636000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 323106 2025-12-04T12:33:42.6946921Z I1204 12:32:01.636000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 323107 2025-12-04T12:33:42.6947259Z I1204 12:32:01.637000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 323108 2025-12-04T12:33:42.6947602Z I1204 12:32:01.637000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 323109 2025-12-04T12:33:42.6948076Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T12:33:42.6948456Z return func(*args, **kwargs) 2025-12-04T12:33:42.6948564Z dist init r=0, world=4 2025-12-04T12:33:42.6948661Z dist init r=3, world=4 2025-12-04T12:33:42.6948759Z dist init r=1, world=4 2025-12-04T12:33:42.6948852Z dist init r=2, world=4 2025-12-04T12:33:42.6948948Z PASSED [8.0162s] [ 52%] 2025-12-04T12:33:42.6949388Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload0_offload_activations_False_use_orig_params_True I1204 12:32:09.653000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 323439 2025-12-04T12:33:42.6949969Z I1204 12:32:09.654000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 323440 2025-12-04T12:33:42.6950308Z I1204 12:32:09.654000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 323441 2025-12-04T12:33:42.6950645Z I1204 12:32:09.655000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 323442 2025-12-04T12:33:42.6951116Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:33:42.6951483Z return func(*args, **kwargs) 2025-12-04T12:33:42.6951592Z dist init r=0, world=4 2025-12-04T12:33:42.6951691Z dist init r=3, world=4 2025-12-04T12:33:42.6951789Z dist init r=1, world=4 2025-12-04T12:33:42.6951884Z dist init r=2, world=4 2025-12-04T12:33:42.6951981Z PASSED [8.2152s] [ 58%] 2025-12-04T12:33:42.6952402Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload0_offload_activations_True_use_orig_params_False I1204 12:32:17.870000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 323772 2025-12-04T12:33:42.6952949Z I1204 12:32:17.871000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 323773 2025-12-04T12:33:42.6953287Z I1204 12:32:17.871000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 323774 2025-12-04T12:33:42.6953624Z I1204 12:32:17.872000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 323775 2025-12-04T12:33:42.6954137Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T12:33:42.6954500Z return func(*args, **kwargs) 2025-12-04T12:33:42.6954608Z dist init r=0, world=4 2025-12-04T12:33:42.6954707Z dist init r=3, world=4 2025-12-04T12:33:42.6954803Z dist init r=2, world=4 2025-12-04T12:33:42.6954897Z dist init r=1, world=4 2025-12-04T12:33:42.6954996Z PASSED [8.3157s] [ 64%] 2025-12-04T12:33:42.6955414Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload0_offload_activations_True_use_orig_params_True I1204 12:32:26.187000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 324105 2025-12-04T12:33:42.6955956Z I1204 12:32:26.188000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 324106 2025-12-04T12:33:42.6956299Z I1204 12:32:26.188000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 324107 2025-12-04T12:33:42.6956638Z I1204 12:32:26.189000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 324108 2025-12-04T12:33:42.6957111Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:33:42.6957490Z return func(*args, **kwargs) 2025-12-04T12:33:42.6957600Z dist init r=0, world=4 2025-12-04T12:33:42.6957698Z dist init r=3, world=4 2025-12-04T12:33:42.6957796Z dist init r=1, world=4 2025-12-04T12:33:42.6957893Z dist init r=2, world=4 2025-12-04T12:33:42.6957991Z PASSED [8.4165s] [ 70%] 2025-12-04T12:33:42.6958431Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload1_offload_activations_False_use_orig_params_False I1204 12:32:34.606000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 324438 2025-12-04T12:33:42.6958975Z I1204 12:32:34.606000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 324439 2025-12-04T12:33:42.6959314Z I1204 12:32:34.607000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 324440 2025-12-04T12:33:42.6959702Z I1204 12:32:34.607000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 324441 2025-12-04T12:33:42.6960179Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T12:33:42.6960546Z return func(*args, **kwargs) 2025-12-04T12:33:42.6960655Z dist init r=0, world=4 2025-12-04T12:33:42.6960758Z dist init r=3, world=4 2025-12-04T12:33:42.6960854Z dist init r=1, world=4 2025-12-04T12:33:42.6960951Z dist init r=2, world=4 2025-12-04T12:33:42.6961048Z PASSED [8.2167s] [ 76%] 2025-12-04T12:33:42.6961476Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload1_offload_activations_False_use_orig_params_True I1204 12:32:42.824000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 324771 2025-12-04T12:33:42.6962026Z I1204 12:32:42.824000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 324772 2025-12-04T12:33:42.6962366Z I1204 12:32:42.825000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 324773 2025-12-04T12:33:42.6962704Z I1204 12:32:42.825000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 324774 2025-12-04T12:33:42.6963214Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:33:42.6963576Z return func(*args, **kwargs) 2025-12-04T12:33:42.6963687Z dist init r=0, world=4 2025-12-04T12:33:42.6963785Z dist init r=3, world=4 2025-12-04T12:33:42.6963881Z dist init r=1, world=4 2025-12-04T12:33:42.6963977Z dist init r=2, world=4 2025-12-04T12:33:42.6964074Z PASSED [8.4173s] [ 82%] 2025-12-04T12:33:42.6964495Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload1_offload_activations_True_use_orig_params_False I1204 12:32:51.243000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 325104 2025-12-04T12:33:42.6965040Z I1204 12:32:51.243000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 325105 2025-12-04T12:33:42.6965382Z I1204 12:32:51.244000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 325106 2025-12-04T12:33:42.6965719Z I1204 12:32:51.244000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 325107 2025-12-04T12:33:42.6966191Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T12:33:42.6966573Z return func(*args, **kwargs) 2025-12-04T12:33:42.6966682Z dist init r=0, world=4 2025-12-04T12:33:42.6966780Z dist init r=3, world=4 2025-12-04T12:33:42.6966876Z dist init r=1, world=4 2025-12-04T12:33:42.6966973Z dist init r=2, world=4 2025-12-04T12:33:42.6967070Z PASSED [8.2160s] [ 88%] 2025-12-04T12:33:42.6967579Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload1_offload_activations_True_use_orig_params_True I1204 12:32:59.460000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 325437 2025-12-04T12:33:42.6968176Z I1204 12:32:59.461000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 325438 2025-12-04T12:33:42.6968638Z I1204 12:32:59.461000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 325439 2025-12-04T12:33:42.6969028Z I1204 12:32:59.462000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 325440 2025-12-04T12:33:42.6969544Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:33:42.6977796Z return func(*args, **kwargs) 2025-12-04T12:33:42.6977938Z dist init r=0, world=4 2025-12-04T12:33:42.6978048Z dist init r=3, world=4 2025-12-04T12:33:42.6978151Z dist init r=2, world=4 2025-12-04T12:33:42.6978253Z dist init r=1, world=4 2025-12-04T12:33:42.6978366Z PASSED [8.0142s] [ 94%] 2025-12-04T12:33:42.6978787Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda I1204 12:33:07.476000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 325770 2025-12-04T12:33:42.6979328Z I1204 12:33:07.477000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 325771 2025-12-04T12:33:42.6979729Z I1204 12:33:07.478000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 325772 2025-12-04T12:33:42.6980079Z I1204 12:33:07.478000 320373 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 325773 2025-12-04T12:33:42.6980653Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.6981058Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:33:42.6981689Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:42.6982285Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:42.6982663Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T12:33:42.6983065Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:33:42.6983677Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:42.6984289Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:42.6984666Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.6985069Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:33:42.6985482Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.6985880Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:33:42.6986282Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.6986677Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:33:42.6987075Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.6987462Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:33:42.6987857Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.6988255Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:33:42.6988650Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.6989039Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:33:42.6989434Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.6989868Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:33:42.6990527Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
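The FSDP UserWarning above (a `device_id` of `cuda` with no explicit index) suggests either calling `torch.cuda.set_device()` before FSDP initialization or passing an explicit device index as `device_id`. A hypothetical sketch of both options, assuming the default process group is already initialized and `local_rank` is known:

    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    local_rank = 0  # illustrative; in the test each worker uses its own rank
    module_a = nn.Linear(8, 8)  # placeholder modules
    module_b = nn.Linear(8, 8)

    # Option 1: make the current device explicit before wrapping.
    torch.cuda.set_device(local_rank)
    wrapped_a = FSDP(module_a)

    # Option 2: pass a device with an explicit index instead of plain "cuda".
    wrapped_b = FSDP(module_b, device_id=torch.device("cuda", local_rank))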
2025-12-04T12:33:42.6991124Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:42.6991497Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.6991885Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:33:42.6992275Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.6992665Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:33:42.6993066Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.6993459Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:33:42.6993857Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.6994275Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:33:42.6994892Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:42.6995500Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:42.6995870Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.6996257Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:33:42.6996649Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.6997044Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:33:42.6997440Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.6997832Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:33:42.6999268Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. 
To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:42.7000731Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:42.7002159Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:42.7003572Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:42.7004995Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:42.7006432Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:42.7007845Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. 
To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:42.7009253Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:42.7009565Z [rank2]:E1204 12:33:14.612000 325772 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:42.7009960Z [rank2]:E1204 12:33:14.612000 325772 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:42.7010494Z [rank2]:E1204 12:33:14.612000 325772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:42.7010984Z [rank2]:E1204 12:33:14.612000 325772 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:42.7011473Z [rank2]:E1204 12:33:14.612000 325772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:42.7011928Z [rank2]:E1204 12:33:14.612000 325772 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:42.7012376Z [rank2]:E1204 12:33:14.612000 325772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7012852Z [rank2]:E1204 12:33:14.612000 325772 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:42.7013326Z [rank2]:E1204 12:33:14.612000 325772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7013797Z [rank2]:E1204 12:33:14.612000 325772 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:42.7014290Z [rank2]:E1204 12:33:14.612000 325772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:42.7014754Z [rank2]:E1204 12:33:14.612000 325772 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:42.7015235Z [rank2]:E1204 12:33:14.612000 325772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:42.7015705Z [rank2]:E1204 12:33:14.612000 325772 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:42.7016407Z [rank2]:E1204 12:33:14.612000 325772 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 2. CUDA driver allocated memory was 2300575744 and is now 3789553664. 2025-12-04T12:33:42.7017073Z [rank2]:E1204 12:33:14.612000 325772 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:42.7017437Z [rank2]:E1204 12:33:14.612000 325772 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:42.7018067Z [rank2]:E1204 12:33:14.612000 325772 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:33:42.7018613Z [rank2]:E1204 12:33:14.612000 325772 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:42.7018985Z [rank2]:E1204 12:33:14.612000 325772 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:42.7019405Z [rank2]:E1204 12:33:14.612000 325772 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:33:42.7019686Z dist init r=2, world=4 2025-12-04T12:33:42.7019928Z [rank3]:E1204 12:33:14.629000 325773 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:42.7020275Z [rank3]:E1204 12:33:14.629000 325773 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:42.7020767Z [rank3]:E1204 12:33:14.629000 325773 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:42.7021256Z [rank3]:E1204 12:33:14.629000 325773 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:42.7021742Z [rank3]:E1204 12:33:14.629000 325773 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:42.7022195Z [rank3]:E1204 12:33:14.629000 325773 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:42.7022641Z [rank3]:E1204 12:33:14.629000 325773 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7023111Z [rank3]:E1204 12:33:14.629000 325773 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:42.7023596Z [rank3]:E1204 12:33:14.629000 325773 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7024061Z [rank3]:E1204 12:33:14.629000 325773 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:42.7024547Z [rank3]:E1204 12:33:14.629000 325773 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:42.7025001Z [rank3]:E1204 12:33:14.629000 325773 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:42.7025461Z [rank3]:E1204 12:33:14.629000 325773 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:42.7025934Z [rank3]:E1204 12:33:14.629000 325773 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:42.7026629Z [rank3]:E1204 12:33:14.629000 325773 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 3. CUDA driver allocated memory was 2250244096 and is now 3739222016. 2025-12-04T12:33:42.7027283Z [rank3]:E1204 12:33:14.629000 325773 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:42.7027643Z [rank3]:E1204 12:33:14.629000 325773 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:42.7028268Z [rank3]:E1204 12:33:14.629000 325773 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:33:42.7028807Z [rank3]:E1204 12:33:14.629000 325773 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:42.7029201Z [rank3]:E1204 12:33:14.629000 325773 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:42.7029660Z [rank3]:E1204 12:33:14.629000 325773 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:33:42.7029910Z dist init r=3, world=4 2025-12-04T12:33:42.7030119Z [rank1]:E1204 12:33:14.646000 325771 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:42.7030464Z [rank1]:E1204 12:33:14.646000 325771 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:42.7030963Z [rank1]:E1204 12:33:14.646000 325771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:42.7031451Z [rank1]:E1204 12:33:14.646000 325771 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:42.7031937Z [rank1]:E1204 12:33:14.646000 325771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:42.7032394Z [rank1]:E1204 12:33:14.646000 325771 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:42.7032838Z [rank1]:E1204 12:33:14.646000 325771 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7033324Z [rank1]:E1204 12:33:14.646000 325771 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:42.7033789Z [rank1]:E1204 12:33:14.646000 325771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7034290Z [rank1]:E1204 12:33:14.646000 325771 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:42.7034751Z [rank1]:E1204 12:33:14.646000 325771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:42.7035201Z [rank1]:E1204 12:33:14.646000 325771 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:42.7035658Z [rank1]:E1204 12:33:14.646000 325771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:42.7036129Z [rank1]:E1204 12:33:14.646000 325771 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:42.7036820Z [rank1]:E1204 12:33:14.646000 325771 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 1. CUDA driver allocated memory was 2317352960 and is now 3806330880. 
2025-12-04T12:33:42.7037468Z [rank1]:E1204 12:33:14.646000 325771 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:42.7037818Z [rank1]:E1204 12:33:14.646000 325771 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:42.7038472Z [rank1]:E1204 12:33:14.646000 325771 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:33:42.7039007Z [rank1]:E1204 12:33:14.646000 325771 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:42.7039370Z [rank1]:E1204 12:33:14.646000 325771 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:42.7039824Z [rank1]:E1204 12:33:14.646000 325771 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:33:42.7040068Z dist init r=1, world=4 2025-12-04T12:33:42.7040268Z [rank0]:E1204 12:33:14.684000 325770 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:42.7040604Z [rank0]:E1204 12:33:14.684000 325770 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:42.7041095Z [rank0]:E1204 12:33:14.684000 325770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:42.7041575Z [rank0]:E1204 12:33:14.684000 325770 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:42.7042052Z [rank0]:E1204 12:33:14.684000 325770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:42.7042518Z [rank0]:E1204 12:33:14.684000 325770 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:42.7042955Z [rank0]:E1204 12:33:14.684000 325770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7043433Z [rank0]:E1204 12:33:14.684000 325770 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:42.7043898Z [rank0]:E1204 12:33:14.684000 325770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7044360Z [rank0]:E1204 12:33:14.684000 325770 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:42.7044824Z [rank0]:E1204 12:33:14.684000 325770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:42.7045277Z [rank0]:E1204 12:33:14.684000 325770 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T12:33:42.7045744Z [rank0]:E1204 12:33:14.684000 325770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:42.7046215Z [rank0]:E1204 12:33:14.684000 325770 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:42.7046904Z [rank0]:E1204 12:33:14.684000 325770 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 0. CUDA driver allocated memory was 2459959296 and is now 3948937216. 2025-12-04T12:33:42.7047568Z [rank0]:E1204 12:33:14.684000 325770 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:42.7047955Z [rank0]:E1204 12:33:14.684000 325770 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:42.7048578Z [rank0]:E1204 12:33:14.684000 325770 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:33:42.7049111Z [rank0]:E1204 12:33:14.684000 325770 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:42.7049473Z [rank0]:E1204 12:33:14.684000 325770 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:42.7049932Z [rank0]:E1204 12:33:14.684000 325770 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:33:42.7050173Z dist init r=0, world=4 2025-12-04T12:33:42.7050602Z [rank0]:[W1204 12:33:14.946474981 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:33:42.7051012Z FAILED [9.0166s] [100%] 2025-12-04T12:33:42.7051080Z 2025-12-04T12:33:42.7051139Z =================================== FAILURES =================================== 2025-12-04T12:33:42.7051355Z _ TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda _ 2025-12-04T12:33:42.7051577Z Traceback (most recent call last): 2025-12-04T12:33:42.7051829Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:33:42.7052077Z self._join_processes(fn) 2025-12-04T12:33:42.7052327Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:33:42.7052610Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:33:42.7052884Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:33:42.7053148Z raise RuntimeError(error) 2025-12-04T12:33:42.7053304Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:33:42.7053467Z Traceback (most recent call last): 2025-12-04T12:33:42.7053715Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:42.7053959Z getattr(self, test_name)() 2025-12-04T12:33:42.7054195Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:42.7054430Z fn() 2025-12-04T12:33:42.7054638Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7054873Z method(*args, **kwargs) 2025-12-04T12:33:42.7055100Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7055333Z method(*args, **kwargs) 2025-12-04T12:33:42.7055549Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:42.7055771Z with policy(): 2025-12-04T12:33:42.7055980Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:42.7056208Z raise RuntimeError(msg) 2025-12-04T12:33:42.7056649Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 2. CUDA driver allocated memory was 2300575744 and is now 3789553664. 
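The ProcessGroupNCCL warning above notes that destroy_process_group() was not called before program exit. A minimal sketch of the explicit teardown it asks for:

    import torch.distributed as dist

    # ... test or training body runs here ...
    if dist.is_initialized():  # guard so the call is also safe in single-process runs
        dist.destroy_process_group()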
2025-12-04T12:33:42.7057092Z 2025-12-04T12:33:42.7057167Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:42.7057539Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:33:42.7057833Z 2025-12-04T12:33:42.7057923Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:42.7058047Z 2025-12-04T12:33:42.7058105Z Process 3 exited with error code 10 and exception: 2025-12-04T12:33:42.7058241Z Traceback (most recent call last): 2025-12-04T12:33:42.7058480Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:42.7058719Z getattr(self, test_name)() 2025-12-04T12:33:42.7058955Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:42.7059185Z fn() 2025-12-04T12:33:42.7059383Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7059640Z method(*args, **kwargs) 2025-12-04T12:33:42.7059856Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7060082Z method(*args, **kwargs) 2025-12-04T12:33:42.7060317Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:42.7060539Z with policy(): 2025-12-04T12:33:42.7060746Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:42.7060974Z raise RuntimeError(msg) 2025-12-04T12:33:42.7061432Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 3. CUDA driver allocated memory was 2250244096 and is now 3739222016. 2025-12-04T12:33:42.7061838Z 2025-12-04T12:33:42.7061912Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:42.7062278Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:33:42.7062574Z 2025-12-04T12:33:42.7062663Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:42.7062785Z 2025-12-04T12:33:42.7062786Z 2025-12-04T12:33:42.7062866Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:33:42.7063066Z Process 2 terminated with exit code 10, terminating remaining processes. 
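The repeated RuntimeError above is raised by PyTorch's CUDA memory-leak checker (enabled here via PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1), which snapshots caching-allocator and driver-level memory counters before the test body and compares them afterwards. The following is only a minimal sketch of that comparison, not the actual torch.testing._internal.common_utils.CudaMemoryLeakCheck code, and the check_for_leak helper name is made up for illustration.

import torch

def snapshot(device: int):
    # Caching-allocator view vs. driver view of memory on one device.
    torch.cuda.synchronize(device)
    allocator_bytes = torch.cuda.memory_allocated(device)
    free, total = torch.cuda.mem_get_info(device)
    return allocator_bytes, total - free

def check_for_leak(fn, device: int = 0):
    # Hypothetical helper: run fn() and fail if allocator usage grew,
    # mirroring the "was X and is now Y" numbers printed in the log above.
    before_alloc, before_driver = snapshot(device)
    fn()
    after_alloc, after_driver = snapshot(device)
    if after_alloc > before_alloc:
        raise RuntimeError(
            f"possible leak on device {device}: caching allocator went from "
            f"{before_alloc} to {after_alloc} bytes "
            f"(driver: {before_driver} -> {after_driver} bytes)"
        )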
2025-12-04T12:33:42.7063445Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_checkpoint/distributed.fsdp.test_fsdp_checkpoint-ef762e98490b0b73.xml - 2025-12-04T12:33:42.7063789Z =========================== short test summary info ============================ 2025-12-04T12:33:42.7064166Z FAILED [9.0166s] distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:33:42.7064524Z Traceback (most recent call last): 2025-12-04T12:33:42.7064766Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:42.7065006Z getattr(self, test_name)() 2025-12-04T12:33:42.7065237Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:42.7065467Z fn() 2025-12-04T12:33:42.7065701Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7065927Z method(*args, **kwargs) 2025-12-04T12:33:42.7066144Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7066369Z method(*args, **kwargs) 2025-12-04T12:33:42.7066585Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:42.7066811Z with policy(): 2025-12-04T12:33:42.7067018Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:42.7067246Z raise RuntimeError(msg) 2025-12-04T12:33:42.7067689Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 2. CUDA driver allocated memory was 2300575744 and is now 3789553664. 
2025-12-04T12:33:42.7068100Z 2025-12-04T12:33:42.7068172Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:42.7068540Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:33:42.7068832Z 2025-12-04T12:33:42.7068932Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:42.7069057Z 2025-12-04T12:33:42.7069113Z Process 3 exited with error code 10 and exception: 2025-12-04T12:33:42.7069249Z Traceback (most recent call last): 2025-12-04T12:33:42.7069490Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:42.7069785Z getattr(self, test_name)() 2025-12-04T12:33:42.7070018Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:42.7070246Z fn() 2025-12-04T12:33:42.7070443Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7070668Z method(*args, **kwargs) 2025-12-04T12:33:42.7070885Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7071111Z method(*args, **kwargs) 2025-12-04T12:33:42.7071325Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:42.7071548Z with policy(): 2025-12-04T12:33:42.7071754Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:42.7071984Z raise RuntimeError(msg) 2025-12-04T12:33:42.7072426Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 3. CUDA driver allocated memory was 2250244096 and is now 3739222016. 2025-12-04T12:33:42.7072834Z 2025-12-04T12:33:42.7072907Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:42.7073279Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:33:42.7073571Z 2025-12-04T12:33:42.7073659Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:42.7073844Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:33:42.7074012Z =================== 1 failed, 16 passed in 141.51s (0:02:21) =================== 2025-12-04T12:33:42.7074181Z Got exit code 1 2025-12-04T12:33:42.7074274Z Retrying single test... 
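Each failing run above also ends with the ProcessGroupNCCL warning that destroy_process_group() was not called before program exit. The pattern that warning asks for is sketched below; the backend, address, and port values are placeholders for illustration rather than settings taken from this job, and torch.cuda.set_device(rank) is included because the FSDP `device_id` warnings later in this log recommend an explicit device index as well.

import os
import torch
import torch.distributed as dist

def run(rank: int, world_size: int):
    # Placeholder rendezvous settings for a single-node illustration.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)  # give FSDP/NCCL an explicit device index
    try:
        pass  # test or training body goes here
    finally:
        dist.destroy_process_group()  # explicit teardown avoids the warning at exit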
2025-12-04T12:33:42.7074549Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_checkpoint/distributed.fsdp.test_fsdp_checkpoint-d1aea10b4f99f99c.xml 2025-12-04T12:33:42.7074854Z ============================= test session starts ============================== 2025-12-04T12:33:42.7075063Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:33:42.7075250Z cachedir: .pytest_cache 2025-12-04T12:33:42.7075473Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:42.7075708Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:42.7075823Z configfile: pytest.ini 2025-12-04T12:33:42.7076046Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:42.7076589Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:292: PytestCollectionWarning: cannot collect test class 'TestModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_checkpoint.py) 2025-12-04T12:33:42.7076993Z class TestModel(nn.Module): 2025-12-04T12:33:42.7077116Z collected 17 items / 16 deselected / 1 selected 2025-12-04T12:33:42.7077453Z stepcurrent: skipping 16 already run items. Running only test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:33:42.7077794Z Running 1 items in this shard 2025-12-04T12:33:42.7077864Z 2025-12-04T12:33:42.7078202Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda I1204 12:33:19.069000 326103 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 326172 2025-12-04T12:33:42.7078743Z I1204 12:33:19.070000 326103 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 326173 2025-12-04T12:33:42.7079082Z I1204 12:33:19.070000 326103 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 326174 2025-12-04T12:33:42.7079419Z I1204 12:33:19.071000 326103 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 326175 2025-12-04T12:33:42.7079921Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7080311Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:33:42.7080936Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:42.7081519Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:42.7081882Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T12:33:42.7082263Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:33:42.7082649Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7083033Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:33:42.7083458Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7083841Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:33:42.7084232Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7084609Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:33:42.7085397Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:42.7085986Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:42.7086351Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7086729Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:33:42.7087116Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7087517Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:33:42.7087902Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7088299Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:33:42.7088686Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7089063Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:33:42.7089713Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:33:42.7090303Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:42.7090674Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7091053Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:33:42.7091432Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7091811Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:33:42.7092199Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7092579Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:33:42.7092967Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7093381Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:33:42.7093981Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:42.7094560Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:42.7094922Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7095298Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:33:42.7095677Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7096058Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:33:42.7096442Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7096841Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:33:42.7098224Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. 
To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:42.7099700Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:42.7101112Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:42.7102510Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:42.7103941Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:42.7105334Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:42.7106738Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. 
To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:42.7108156Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:42.7108455Z [rank1]:E1204 12:33:26.171000 326173 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:42.7108792Z [rank1]:E1204 12:33:26.171000 326173 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:42.7109278Z [rank1]:E1204 12:33:26.171000 326173 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:42.7109793Z [rank1]:E1204 12:33:26.171000 326173 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:42.7110272Z [rank1]:E1204 12:33:26.171000 326173 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:42.7110718Z [rank1]:E1204 12:33:26.171000 326173 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:42.7111155Z [rank1]:E1204 12:33:26.171000 326173 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7111618Z [rank1]:E1204 12:33:26.171000 326173 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:42.7112081Z [rank1]:E1204 12:33:26.171000 326173 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7112578Z [rank1]:E1204 12:33:26.171000 326173 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:42.7113038Z [rank1]:E1204 12:33:26.171000 326173 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:42.7113486Z [rank1]:E1204 12:33:26.171000 326173 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:42.7113937Z [rank1]:E1204 12:33:26.171000 326173 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:42.7114399Z [rank1]:E1204 12:33:26.171000 326173 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:42.7115092Z [rank1]:E1204 12:33:26.171000 326173 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 1. CUDA driver allocated memory was 2317352960 and is now 3806330880. 2025-12-04T12:33:42.7115735Z [rank1]:E1204 12:33:26.171000 326173 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:42.7116081Z [rank1]:E1204 12:33:26.171000 326173 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:42.7116711Z [rank1]:E1204 12:33:26.171000 326173 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:33:42.7117267Z [rank1]:E1204 12:33:26.171000 326173 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:42.7117629Z [rank1]:E1204 12:33:26.171000 326173 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:42.7118036Z [rank1]:E1204 12:33:26.171000 326173 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:33:42.7118276Z dist init r=1, world=4 2025-12-04T12:33:42.7118475Z [rank3]:E1204 12:33:26.193000 326175 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:42.7118808Z [rank3]:E1204 12:33:26.193000 326175 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:42.7119288Z [rank3]:E1204 12:33:26.193000 326175 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:42.7119858Z [rank3]:E1204 12:33:26.193000 326175 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:42.7120328Z [rank3]:E1204 12:33:26.193000 326175 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:42.7120770Z [rank3]:E1204 12:33:26.193000 326175 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:42.7121208Z [rank3]:E1204 12:33:26.193000 326175 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7121666Z [rank3]:E1204 12:33:26.193000 326175 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:42.7122156Z [rank3]:E1204 12:33:26.193000 326175 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7122620Z [rank3]:E1204 12:33:26.193000 326175 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:42.7123082Z [rank3]:E1204 12:33:26.193000 326175 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:42.7123534Z [rank3]:E1204 12:33:26.193000 326175 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:42.7123997Z [rank3]:E1204 12:33:26.193000 326175 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:42.7124464Z [rank3]:E1204 12:33:26.193000 326175 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:42.7125154Z [rank3]:E1204 12:33:26.193000 326175 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 3. CUDA driver allocated memory was 2250244096 and is now 3739222016. 2025-12-04T12:33:42.7125818Z [rank3]:E1204 12:33:26.193000 326175 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:42.7126167Z [rank3]:E1204 12:33:26.193000 326175 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:42.7126803Z [rank3]:E1204 12:33:26.193000 326175 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:33:42.7127334Z [rank3]:E1204 12:33:26.193000 326175 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:42.7127700Z [rank3]:E1204 12:33:26.193000 326175 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:42.7128115Z [rank3]:E1204 12:33:26.193000 326175 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:33:42.7128356Z dist init r=3, world=4 2025-12-04T12:33:42.7128560Z [rank2]:E1204 12:33:26.239000 326174 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:42.7128898Z [rank2]:E1204 12:33:26.239000 326174 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:42.7129386Z [rank2]:E1204 12:33:26.239000 326174 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:42.7129903Z [rank2]:E1204 12:33:26.239000 326174 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:42.7130385Z [rank2]:E1204 12:33:26.239000 326174 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:42.7130840Z [rank2]:E1204 12:33:26.239000 326174 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:42.7131309Z [rank2]:E1204 12:33:26.239000 326174 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7131775Z [rank2]:E1204 12:33:26.239000 326174 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:42.7132238Z [rank2]:E1204 12:33:26.239000 326174 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7132703Z [rank2]:E1204 12:33:26.239000 326174 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:42.7133168Z [rank2]:E1204 12:33:26.239000 326174 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:42.7133625Z [rank2]:E1204 12:33:26.239000 326174 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:42.7134078Z [rank2]:E1204 12:33:26.239000 326174 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:42.7134542Z [rank2]:E1204 12:33:26.239000 326174 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:42.7135248Z [rank2]:E1204 12:33:26.239000 326174 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 2. CUDA driver allocated memory was 2300575744 and is now 3789553664. 
2025-12-04T12:33:42.7136169Z [rank2]:E1204 12:33:26.239000 326174 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:42.7136518Z [rank2]:E1204 12:33:26.239000 326174 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:42.7137141Z [rank2]:E1204 12:33:26.239000 326174 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:33:42.7137679Z [rank2]:E1204 12:33:26.239000 326174 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:42.7138045Z [rank2]:E1204 12:33:26.239000 326174 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:42.7138462Z [rank2]:E1204 12:33:26.239000 326174 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:33:42.7138704Z dist init r=2, world=4 2025-12-04T12:33:42.7138907Z [rank0]:E1204 12:33:26.269000 326172 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:42.7139245Z [rank0]:E1204 12:33:26.269000 326172 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:42.7139769Z [rank0]:E1204 12:33:26.269000 326172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:42.7140250Z [rank0]:E1204 12:33:26.269000 326172 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:42.7140763Z [rank0]:E1204 12:33:26.269000 326172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:42.7141212Z [rank0]:E1204 12:33:26.269000 326172 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:42.7141654Z [rank0]:E1204 12:33:26.269000 326172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7142119Z [rank0]:E1204 12:33:26.269000 326172 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:42.7142585Z [rank0]:E1204 12:33:26.269000 326172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7143053Z [rank0]:E1204 12:33:26.269000 326172 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:42.7143517Z [rank0]:E1204 12:33:26.269000 326172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:42.7143974Z [rank0]:E1204 12:33:26.269000 326172 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T12:33:42.7144432Z [rank0]:E1204 12:33:26.269000 326172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:42.7144912Z [rank0]:E1204 12:33:26.269000 326172 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:42.7145617Z [rank0]:E1204 12:33:26.269000 326172 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 0. CUDA driver allocated memory was 2459959296 and is now 3948937216. 2025-12-04T12:33:42.7146267Z [rank0]:E1204 12:33:26.269000 326172 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:42.7146618Z [rank0]:E1204 12:33:26.269000 326172 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:42.7147238Z [rank0]:E1204 12:33:26.269000 326172 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:33:42.7147776Z [rank0]:E1204 12:33:26.269000 326172 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:42.7148141Z [rank0]:E1204 12:33:26.269000 326172 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:42.7148556Z [rank0]:E1204 12:33:26.269000 326172 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:33:42.7148798Z dist init r=0, world=4 2025-12-04T12:33:42.7149204Z [rank0]:[W1204 12:33:26.515633729 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:33:42.7149666Z FAILED [9.0192s] [100%] 2025-12-04T12:33:42.7149734Z 2025-12-04T12:33:42.7149793Z =================================== FAILURES =================================== 2025-12-04T12:33:42.7150034Z _ TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda _ 2025-12-04T12:33:42.7150237Z Traceback (most recent call last): 2025-12-04T12:33:42.7150487Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:33:42.7150739Z self._join_processes(fn) 2025-12-04T12:33:42.7150988Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:33:42.7151256Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:33:42.7151525Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:33:42.7151791Z raise RuntimeError(error) 2025-12-04T12:33:42.7151944Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:33:42.7152107Z Traceback (most recent call last): 2025-12-04T12:33:42.7152354Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:42.7152596Z getattr(self, test_name)() 2025-12-04T12:33:42.7152831Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:42.7153067Z fn() 2025-12-04T12:33:42.7153271Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7153529Z method(*args, **kwargs) 2025-12-04T12:33:42.7153757Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7153991Z method(*args, **kwargs) 2025-12-04T12:33:42.7154212Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:42.7154457Z with policy(): 2025-12-04T12:33:42.7154675Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:42.7154912Z raise RuntimeError(msg) 2025-12-04T12:33:42.7155361Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 1. CUDA driver allocated memory was 2317352960 and is now 3806330880. 
2025-12-04T12:33:42.7155778Z 2025-12-04T12:33:42.7155859Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:42.7156233Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:33:42.7156527Z 2025-12-04T12:33:42.7156623Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:42.7156747Z 2025-12-04T12:33:42.7156751Z 2025-12-04T12:33:42.7156833Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:33:42.7157036Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:33:42.7157417Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_checkpoint/distributed.fsdp.test_fsdp_checkpoint-d1aea10b4f99f99c.xml - 2025-12-04T12:33:42.7157767Z =========================== short test summary info ============================ 2025-12-04T12:33:42.7158147Z FAILED [9.0192s] distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:33:42.7158504Z Traceback (most recent call last): 2025-12-04T12:33:42.7158751Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:42.7159021Z getattr(self, test_name)() 2025-12-04T12:33:42.7159259Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:42.7159498Z fn() 2025-12-04T12:33:42.7159734Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7159968Z method(*args, **kwargs) 2025-12-04T12:33:42.7160191Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7160425Z method(*args, **kwargs) 2025-12-04T12:33:42.7160648Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:42.7160877Z with policy(): 2025-12-04T12:33:42.7161093Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:42.7161331Z raise RuntimeError(msg) 2025-12-04T12:33:42.7161775Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 1. CUDA driver allocated memory was 2317352960 and is now 3806330880. 2025-12-04T12:33:42.7162183Z 2025-12-04T12:33:42.7162258Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:42.7162650Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:33:42.7162947Z 2025-12-04T12:33:42.7163036Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:42.7163244Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T12:33:42.7163412Z ======================= 1 failed, 16 deselected in 9.03s ======================= 2025-12-04T12:33:42.7163552Z Got exit code 1 2025-12-04T12:33:42.7163650Z Retrying single test... 2025-12-04T12:33:42.7163928Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_checkpoint/distributed.fsdp.test_fsdp_checkpoint-0e6d6998b6f1fe1a.xml 2025-12-04T12:33:42.7164233Z ============================= test session starts ============================== 2025-12-04T12:33:42.7164445Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:33:42.7164639Z cachedir: .pytest_cache 2025-12-04T12:33:42.7164867Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:42.7165109Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:42.7165231Z configfile: pytest.ini 2025-12-04T12:33:42.7165462Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:42.7166002Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:292: PytestCollectionWarning: cannot collect test class 'TestModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_checkpoint.py) 2025-12-04T12:33:42.7166413Z class TestModel(nn.Module): 2025-12-04T12:33:42.7166542Z collected 17 items / 16 deselected / 1 selected 2025-12-04T12:33:42.7166888Z stepcurrent: skipping 16 already run items. Running only test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:33:42.7167221Z Running 1 items in this shard 2025-12-04T12:33:42.7167297Z 2025-12-04T12:33:42.7167670Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda I1204 12:33:30.614000 326505 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 326574 2025-12-04T12:33:42.7168210Z I1204 12:33:30.615000 326505 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 326575 2025-12-04T12:33:42.7168554Z I1204 12:33:30.616000 326505 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 326576 2025-12-04T12:33:42.7168896Z I1204 12:33:30.616000 326505 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 326577 2025-12-04T12:33:42.7169360Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7169784Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:33:42.7170413Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:33:42.7171003Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:42.7171377Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7171780Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:33:42.7172387Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:42.7172984Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:42.7173353Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7173741Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:33:42.7174131Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7174525Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:33:42.7174921Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7175310Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:33:42.7175706Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7176090Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:33:42.7176483Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7176871Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:33:42.7177266Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7177677Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:33:42.7178059Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7178443Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:33:42.7179052Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:42.7179667Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:42.7180033Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7180415Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:33:42.7180804Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7181205Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:33:42.7181597Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7181981Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:33:42.7182391Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7182773Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:33:42.7183376Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:42.7183958Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:42.7184399Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7184780Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:33:42.7185161Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7185541Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:33:42.7185930Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:42.7186317Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:33:42.7187750Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. 
This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:42.7189193Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:42.7190642Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:42.7192058Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:42.7193489Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:42.7194906Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:42.7196329Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. 
This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:42.7197731Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:42.7198033Z [rank2]:E1204 12:33:37.627000 326576 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:42.7198375Z [rank2]:E1204 12:33:37.627000 326576 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:42.7198864Z [rank2]:E1204 12:33:37.627000 326576 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:42.7199346Z [rank2]:E1204 12:33:37.627000 326576 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:42.7199877Z [rank2]:E1204 12:33:37.627000 326576 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:42.7200328Z [rank2]:E1204 12:33:37.627000 326576 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:42.7200772Z [rank2]:E1204 12:33:37.627000 326576 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7201256Z [rank2]:E1204 12:33:37.627000 326576 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:42.7201721Z [rank2]:E1204 12:33:37.627000 326576 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7202198Z [rank2]:E1204 12:33:37.627000 326576 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:42.7202661Z [rank2]:E1204 12:33:37.627000 326576 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:42.7203114Z [rank2]:E1204 12:33:37.627000 326576 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:42.7203573Z [rank2]:E1204 12:33:37.627000 326576 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:42.7204041Z [rank2]:E1204 12:33:37.627000 326576 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:42.7204739Z [rank2]:E1204 12:33:37.627000 326576 
site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 2. CUDA driver allocated memory was 2300575744 and is now 3789553664. 2025-12-04T12:33:42.7205388Z [rank2]:E1204 12:33:37.627000 326576 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:42.7205741Z [rank2]:E1204 12:33:37.627000 326576 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:42.7206387Z [rank2]:E1204 12:33:37.627000 326576 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:33:42.7206922Z [rank2]:E1204 12:33:37.627000 326576 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:42.7207285Z [rank2]:E1204 12:33:37.627000 326576 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:42.7207700Z [rank2]:E1204 12:33:37.627000 326576 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:33:42.7207943Z dist init r=2, world=4 2025-12-04T12:33:42.7208147Z [rank1]:E1204 12:33:37.632000 326575 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:42.7208484Z [rank1]:E1204 12:33:37.632000 326575 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:42.7208973Z [rank1]:E1204 12:33:37.632000 326575 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:42.7209455Z [rank1]:E1204 12:33:37.632000 326575 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:42.7210009Z [rank1]:E1204 12:33:37.632000 326575 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:42.7210475Z [rank1]:E1204 12:33:37.632000 326575 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:42.7210919Z [rank1]:E1204 12:33:37.632000 326575 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7211399Z [rank1]:E1204 12:33:37.632000 326575 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:42.7211862Z [rank1]:E1204 12:33:37.632000 326575 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7212327Z [rank1]:E1204 12:33:37.632000 326575 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:42.7212793Z 
[rank1]:E1204 12:33:37.632000 326575 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:42.7213245Z [rank1]:E1204 12:33:37.632000 326575 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:42.7213701Z [rank1]:E1204 12:33:37.632000 326575 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:42.7214166Z [rank1]:E1204 12:33:37.632000 326575 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:42.7214855Z [rank1]:E1204 12:33:37.632000 326575 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 1. CUDA driver allocated memory was 2317352960 and is now 3806330880. 2025-12-04T12:33:42.7215514Z [rank1]:E1204 12:33:37.632000 326575 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:42.7215901Z [rank1]:E1204 12:33:37.632000 326575 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:42.7216525Z [rank1]:E1204 12:33:37.632000 326575 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:33:42.7217055Z [rank1]:E1204 12:33:37.632000 326575 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:42.7217425Z [rank1]:E1204 12:33:37.632000 326575 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:42.7217844Z [rank1]:E1204 12:33:37.632000 326575 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:33:42.7218089Z dist init r=1, world=4 2025-12-04T12:33:42.7218297Z [rank3]:E1204 12:33:37.675000 326577 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:42.7218637Z [rank3]:E1204 12:33:37.675000 326577 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:42.7219125Z [rank3]:E1204 12:33:37.675000 326577 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:42.7219649Z [rank3]:E1204 12:33:37.675000 326577 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:42.7220127Z [rank3]:E1204 12:33:37.675000 326577 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:42.7220592Z [rank3]:E1204 12:33:37.675000 326577 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 
2025-12-04T12:33:42.7221032Z [rank3]:E1204 12:33:37.675000 326577 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7221501Z [rank3]:E1204 12:33:37.675000 326577 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:42.7221966Z [rank3]:E1204 12:33:37.675000 326577 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7222437Z [rank3]:E1204 12:33:37.675000 326577 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:42.7222912Z [rank3]:E1204 12:33:37.675000 326577 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:42.7223370Z [rank3]:E1204 12:33:37.675000 326577 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:42.7223826Z [rank3]:E1204 12:33:37.675000 326577 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:42.7224296Z [rank3]:E1204 12:33:37.675000 326577 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:42.7225021Z [rank3]:E1204 12:33:37.675000 326577 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 3. CUDA driver allocated memory was 2250244096 and is now 3739222016. 
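Annotation: the UserWarning from torch/distributed/fsdp/_init_utils.py:571 repeated above for each rank says FSDP received `device_id` as a bare "cuda" with no index and names two fixes: call `torch.cuda.set_device()` before constructing FSDP, or pass a device with an explicit index. The sketch below is illustrative only and is not taken from the test file; the helper name `wrap_with_explicit_device` and the `module`/`rank` arguments are assumptions, and an initialized default process group is assumed.

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_with_explicit_device(module: torch.nn.Module, rank: int) -> FSDP:
        # Option 1 from the warning: pin the current device before FSDP init.
        torch.cuda.set_device(rank)
        # Option 2 from the warning: pass a device with an explicit index
        # instead of the bare "cuda" string that triggered it.
        return FSDP(module, device_id=torch.device("cuda", rank))

Either option removes the ambiguity the warning is about; the tests above rely on the implicit current device instead, which is why the warning fires once per rank.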
2025-12-04T12:33:42.7225671Z [rank3]:E1204 12:33:37.675000 326577 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:42.7226021Z [rank3]:E1204 12:33:37.675000 326577 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:42.7226640Z [rank3]:E1204 12:33:37.675000 326577 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:33:42.7227175Z [rank3]:E1204 12:33:37.675000 326577 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:42.7227543Z [rank3]:E1204 12:33:37.675000 326577 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:42.7227956Z [rank3]:E1204 12:33:37.675000 326577 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:33:42.7228196Z dist init r=3, world=4 2025-12-04T12:33:42.7228398Z [rank0]:E1204 12:33:37.755000 326574 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:42.7228738Z [rank0]:E1204 12:33:37.755000 326574 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:42.7229242Z [rank0]:E1204 12:33:37.755000 326574 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:42.7229767Z [rank0]:E1204 12:33:37.755000 326574 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:42.7230260Z [rank0]:E1204 12:33:37.755000 326574 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:42.7230708Z [rank0]:E1204 12:33:37.755000 326574 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:42.7231151Z [rank0]:E1204 12:33:37.755000 326574 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7231616Z [rank0]:E1204 12:33:37.755000 326574 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:42.7232078Z [rank0]:E1204 12:33:37.755000 326574 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7232547Z [rank0]:E1204 12:33:37.755000 326574 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:42.7233013Z [rank0]:E1204 12:33:37.755000 326574 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:42.7233469Z [rank0]:E1204 12:33:37.755000 326574 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T12:33:42.7233926Z [rank0]:E1204 12:33:37.755000 326574 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:42.7234392Z [rank0]:E1204 12:33:37.755000 326574 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:42.7235118Z [rank0]:E1204 12:33:37.755000 326574 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 0. CUDA driver allocated memory was 2459959296 and is now 3948937216. 2025-12-04T12:33:42.7235769Z [rank0]:E1204 12:33:37.755000 326574 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:42.7236121Z [rank0]:E1204 12:33:37.755000 326574 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:42.7236741Z [rank0]:E1204 12:33:37.755000 326574 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:33:42.7237273Z [rank0]:E1204 12:33:37.755000 326574 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:42.7237640Z [rank0]:E1204 12:33:37.755000 326574 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:42.7238054Z [rank0]:E1204 12:33:37.755000 326574 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:33:42.7238311Z dist init r=0, world=4 2025-12-04T12:33:42.7238713Z [rank0]:[W1204 12:33:38.990647884 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:33:42.7239134Z FAILED [8.9225s] [100%] 2025-12-04T12:33:42.7239201Z 2025-12-04T12:33:42.7239260Z =================================== FAILURES =================================== 2025-12-04T12:33:42.7239476Z _ TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda _ 2025-12-04T12:33:42.7239721Z Traceback (most recent call last): 2025-12-04T12:33:42.7239968Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:33:42.7240216Z self._join_processes(fn) 2025-12-04T12:33:42.7240465Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:33:42.7240736Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:33:42.7241012Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:33:42.7241274Z raise RuntimeError(error) 2025-12-04T12:33:42.7241430Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:33:42.7241599Z Traceback (most recent call last): 2025-12-04T12:33:42.7241844Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:42.7242089Z getattr(self, test_name)() 2025-12-04T12:33:42.7242327Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:42.7242563Z fn() 2025-12-04T12:33:42.7242770Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7243004Z method(*args, **kwargs) 2025-12-04T12:33:42.7243229Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7243462Z method(*args, **kwargs) 2025-12-04T12:33:42.7243714Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:42.7243946Z with policy(): 2025-12-04T12:33:42.7244163Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:42.7244401Z raise RuntimeError(msg) 2025-12-04T12:33:42.7244844Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 3. CUDA driver allocated memory was 2250244096 and is now 3739222016. 
2025-12-04T12:33:42.7245258Z 2025-12-04T12:33:42.7245334Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:42.7245712Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:33:42.7246010Z 2025-12-04T12:33:42.7246106Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:42.7246234Z 2025-12-04T12:33:42.7246235Z 2025-12-04T12:33:42.7246313Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:33:42.7246517Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:33:42.7246902Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_checkpoint/distributed.fsdp.test_fsdp_checkpoint-0e6d6998b6f1fe1a.xml - 2025-12-04T12:33:42.7247276Z =========================== short test summary info ============================ 2025-12-04T12:33:42.7247654Z FAILED [8.9225s] distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:33:42.7248027Z Traceback (most recent call last): 2025-12-04T12:33:42.7248278Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:42.7248527Z getattr(self, test_name)() 2025-12-04T12:33:42.7248765Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:42.7249002Z fn() 2025-12-04T12:33:42.7249206Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7249440Z method(*args, **kwargs) 2025-12-04T12:33:42.7249692Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:42.7249926Z method(*args, **kwargs) 2025-12-04T12:33:42.7250147Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:42.7250377Z with policy(): 2025-12-04T12:33:42.7250593Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:42.7250826Z raise RuntimeError(msg) 2025-12-04T12:33:42.7251276Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 3. CUDA driver allocated memory was 2250244096 and is now 3739222016. 2025-12-04T12:33:42.7251687Z 2025-12-04T12:33:42.7251765Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:42.7252139Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:33:42.7252432Z 2025-12-04T12:33:42.7252526Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:42.7252751Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
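Annotation: the failure report above prints the exact repro command (PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda). Below is a small, hypothetical Python wrapper showing one way to invoke it from the base repo dir; the use of `sys.executable` and `subprocess` is my addition, not something the test harness does.

    import os
    import subprocess
    import sys

    env = dict(os.environ)
    env["PYTORCH_TEST_WITH_ROCM"] = "1"
    env["PYTORCH_TEST_CUDA_MEM_LEAK_CHECK"] = "1"

    # Run from the base repo dir, as the log instructs. While the leak persists
    # the test is expected to fail, so the return code is not checked here.
    subprocess.run(
        [
            sys.executable,
            "test/distributed/fsdp/test_fsdp_checkpoint.py",
            "TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda",
        ],
        env=env,
    )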
2025-12-04T12:33:42.7252921Z ======================= 1 failed, 16 deselected in 8.93s ======================= 2025-12-04T12:33:42.7253061Z Got exit code 1 2025-12-04T12:33:42.7253333Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:33:42.7253707Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:33:42.7254088Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_checkpoint/distributed.fsdp.test_fsdp_checkpoint-9f1e911ddb2b895a.xml 2025-12-04T12:33:42.7254393Z ============================= test session starts ============================== 2025-12-04T12:33:42.7254605Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:33:42.7254798Z cachedir: .pytest_cache 2025-12-04T12:33:42.7255031Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:42.7255271Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:42.7255391Z configfile: pytest.ini 2025-12-04T12:33:42.7255619Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:42.7256156Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:292: PytestCollectionWarning: cannot collect test class 'TestModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_checkpoint.py) 2025-12-04T12:33:42.7256588Z class TestModel(nn.Module): 2025-12-04T12:33:42.7256718Z collected 17 items / 17 deselected / 0 selected 2025-12-04T12:33:42.7256861Z stepcurrent: skipping 17 already run items. 2025-12-04T12:33:42.7257014Z Running 0 items in this shard 2025-12-04T12:33:42.7257085Z 2025-12-04T12:33:42.7261336Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_checkpoint/distributed.fsdp.test_fsdp_checkpoint-9f1e911ddb2b895a.xml - 2025-12-04T12:33:42.7261706Z ============================ 17 deselected in 0.01s ============================ 2025-12-04T12:33:42.7262053Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda'] 2025-12-04T12:33:42.7262329Z 2025-12-04T12:33:42.7262535Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_checkpoint 1/1 (test/test-reports/distributed.fsdp.test_fsdp_checkpoint_1.1_c244fb9a4f737098_.log) 2025-12-04T12:33:42.7262782Z 2025-12-04T12:33:42.7262917Z Finished distributed/fsdp/test_fsdp_checkpoint 1/1 ... [2025-12-04 12:33:42.690361][5229663.669400142], took 2.82min 2025-12-04T12:33:42.7263371Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T12:33:42.7263761Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:33:42.7263979Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T12:33:42.7264156Z Uploading artifacts took 0.00 seconds 2025-12-04T12:33:42.7264299Z distributed/fsdp/test_fsdp_checkpoint 1/1 failed! 2025-12-04T12:33:42.7264505Z Running distributed/fsdp/test_fsdp_fine_tune 1/1 ... 
[2025-12-04 12:33:42.693128][5229663.672169396] 2025-12-04T12:33:42.7264707Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:33:42.7265111Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_fine_tune.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:33:42.693318] 2025-12-04T12:36:07.5448974Z 2025-12-04T12:36:07.5452015Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_fine_tune 1/1 (test/test-reports/distributed.fsdp.test_fsdp_fine_tune_1.1_aed87725c804591d_.log) 2025-12-04T12:36:07.5453073Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-d189db2d31ea5eb7.xml 2025-12-04T12:36:07.5453783Z ============================= test session starts ============================== 2025-12-04T12:36:07.5454359Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:36:07.5454814Z cachedir: .pytest_cache 2025-12-04T12:36:07.5455389Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:36:07.5455913Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:36:07.5456168Z configfile: pytest.ini 2025-12-04T12:36:07.5456668Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:36:07.5457191Z collecting ... collected 4 items 2025-12-04T12:36:07.5457485Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T12:36:07.5458964Z Running 4 items in this shard: test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda, test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda, test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda, test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:36:07.5460465Z 2025-12-04T12:36:07.5461099Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda I1204 12:33:44.536000 326975 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 327044 2025-12-04T12:36:07.5462229Z I1204 12:33:44.537000 326975 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 327045 2025-12-04T12:36:07.5463734Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:36:07.5464940Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:36:07.5465839Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:36:07.5466720Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:36:07.5467296Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:36:07.5467850Z return func(*args, **kwargs) 2025-12-04T12:36:07.5468380Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5468918Z return fsdp_fn(module, **kwargs) 2025-12-04T12:36:07.5469463Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5470041Z return fsdp_fn(module, **kwargs) 2025-12-04T12:36:07.5470615Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:123: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5471125Z seq = FSDP( 2025-12-04T12:36:07.5471601Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:123: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5472138Z seq = FSDP( 2025-12-04T12:36:07.5474127Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:36:07.5475895Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:36:07.5477565Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. 
If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:36:07.5479198Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:36:07.5479546Z [rank0]:E1204 12:33:51.622000 327044 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:36:07.5479978Z [rank0]:E1204 12:33:51.622000 327044 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:36:07.5480551Z [rank0]:E1204 12:33:51.622000 327044 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5481102Z [rank0]:E1204 12:33:51.622000 327044 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:36:07.5481656Z [rank0]:E1204 12:33:51.622000 327044 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5482169Z [rank0]:E1204 12:33:51.622000 327044 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:36:07.5482715Z [rank0]:E1204 12:33:51.622000 327044 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5483253Z [rank0]:E1204 12:33:51.622000 327044 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5483787Z [rank0]:E1204 12:33:51.622000 327044 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5484323Z [rank0]:E1204 12:33:51.622000 327044 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5484813Z [rank0]:E1204 12:33:51.622000 327044 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5485265Z [rank0]:E1204 12:33:51.622000 327044 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:36:07.5485719Z [rank0]:E1204 12:33:51.622000 327044 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5486180Z [rank0]:E1204 12:33:51.622000 327044 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:36:07.5486836Z [rank0]:E1204 12:33:51.622000 327044 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 0. CUDA driver allocated memory was 2019557376 and is now 3539992576. 
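Annotation: the RuntimeError above reports two quantities per device, the caching-allocator allocated bytes and the driver-level allocated bytes, measured before and after the test. The snippet below is a rough, hypothetical illustration of that kind of before/after comparison; the real PYTORCH_TEST_CUDA_MEM_LEAK_CHECK logic lives in torch/testing/_internal/common_utils.py and differs in detail, and the `snapshot` helper and device index 0 are assumptions.

    import torch

    def snapshot(device: int) -> tuple[int, int]:
        torch.cuda.synchronize(device)
        allocator_bytes = torch.cuda.memory_allocated(device)  # caching allocator view
        free, total = torch.cuda.mem_get_info(device)          # driver-level view
        return allocator_bytes, total - free

    before = snapshot(0)
    # ... run the test body here ...
    after = snapshot(0)
    if after[0] > before[0] or after[1] > before[1]:
        print(f"possible leak: allocator {before[0]} -> {after[0]}, "
              f"driver {before[1]} -> {after[1]}")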
2025-12-04T12:36:07.5487449Z [rank0]:E1204 12:33:51.622000 327044 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5487794Z [rank0]:E1204 12:33:51.622000 327044 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5488374Z [rank0]:E1204 12:33:51.622000 327044 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:36:07.5488865Z [rank0]:E1204 12:33:51.622000 327044 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5489237Z [rank0]:E1204 12:33:51.622000 327044 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5489698Z [rank0]:E1204 12:33:51.622000 327044 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:36:07.5489941Z dist init r=0, world=2 2025-12-04T12:36:07.5490143Z [rank1]:E1204 12:33:51.625000 327045 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:36:07.5490478Z [rank1]:E1204 12:33:51.625000 327045 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:36:07.5490962Z [rank1]:E1204 12:33:51.625000 327045 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5491438Z [rank1]:E1204 12:33:51.625000 327045 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:36:07.5491954Z [rank1]:E1204 12:33:51.625000 327045 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5492398Z [rank1]:E1204 12:33:51.625000 327045 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:36:07.5492831Z [rank1]:E1204 12:33:51.625000 327045 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5493292Z [rank1]:E1204 12:33:51.625000 327045 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5493756Z [rank1]:E1204 12:33:51.625000 327045 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5494215Z [rank1]:E1204 12:33:51.625000 327045 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5494682Z [rank1]:E1204 12:33:51.625000 327045 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5495133Z [rank1]:E1204 12:33:51.625000 327045 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:36:07.5495589Z 
[rank1]:E1204 12:33:51.625000 327045 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5496084Z [rank1]:E1204 12:33:51.625000 327045 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:36:07.5496722Z [rank1]:E1204 12:33:51.625000 327045 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 1. CUDA driver allocated memory was 1864368128 and is now 3384803328. 2025-12-04T12:36:07.5497335Z [rank1]:E1204 12:33:51.625000 327045 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5497681Z [rank1]:E1204 12:33:51.625000 327045 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5498247Z [rank1]:E1204 12:33:51.625000 327045 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:36:07.5498725Z [rank1]:E1204 12:33:51.625000 327045 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5499086Z [rank1]:E1204 12:33:51.625000 327045 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5499493Z [rank1]:E1204 12:33:51.625000 327045 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:36:07.5499807Z dist init r=1, world=2 2025-12-04T12:36:07.5500220Z [rank0]:[W1204 12:33:51.800387324 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:36:07.5500626Z FAILED [8.9174s] [ 25%] 2025-12-04T12:36:07.5500690Z 2025-12-04T12:36:07.5500747Z =================================== FAILURES =================================== 2025-12-04T12:36:07.5500935Z ____________ TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda _____________ 2025-12-04T12:36:07.5501109Z Traceback (most recent call last): 2025-12-04T12:36:07.5501384Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:36:07.5501627Z self._join_processes(fn) 2025-12-04T12:36:07.5501871Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:36:07.5502134Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:36:07.5502399Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:36:07.5502658Z raise RuntimeError(error) 2025-12-04T12:36:07.5502808Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:36:07.5502967Z Traceback (most recent call last): 2025-12-04T12:36:07.5503206Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5503450Z getattr(self, test_name)() 2025-12-04T12:36:07.5503681Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5503913Z fn() 2025-12-04T12:36:07.5504115Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5504346Z method(*args, **kwargs) 2025-12-04T12:36:07.5504570Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5504815Z method(*args, **kwargs) 2025-12-04T12:36:07.5505031Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5505255Z with policy(): 2025-12-04T12:36:07.5505471Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5505726Z raise RuntimeError(msg) 2025-12-04T12:36:07.5506122Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 0. CUDA driver allocated memory was 2019557376 and is now 3539992576. 2025-12-04T12:36:07.5506494Z 2025-12-04T12:36:07.5506567Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5506885Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:36:07.5507127Z 2025-12-04T12:36:07.5507216Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5507339Z 2025-12-04T12:36:07.5507341Z 2025-12-04T12:36:07.5507421Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:36:07.5507622Z Process 0 terminated with exit code 10, terminating remaining processes. 
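Annotation: two warnings in this log concern process-group lifecycle: the barrier() UserWarning from c10d_logger.py:83 ("You can specify `device_id` in `init_process_group` to mute this warning") and the ProcessGroupNCCL warning above that destroy_process_group() was not called before exit. The sketch below is a minimal per-rank setup/teardown along those lines; the `run` function, the "nccl" backend choice (RCCL on these ROCm runners), and the assumption that the launcher provides `rank`, `world_size`, and the rendezvous environment variables are all mine.

    import torch
    import torch.distributed as dist

    def run(rank: int, world_size: int) -> None:
        device = torch.device("cuda", rank)
        torch.cuda.set_device(device)
        # Passing an explicit device mutes the barrier() warning seen above.
        dist.init_process_group("nccl", rank=rank, world_size=world_size,
                                device_id=device)
        try:
            dist.barrier()
            # ... test body ...
        finally:
            # Explicit teardown avoids the destroy_process_group() warning.
            dist.destroy_process_group()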
2025-12-04T12:36:07.5507991Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-d189db2d31ea5eb7.xml - 2025-12-04T12:36:07.5508329Z =========================== short test summary info ============================ 2025-12-04T12:36:07.5508657Z FAILED [8.9174s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:36:07.5509221Z Traceback (most recent call last): 2025-12-04T12:36:07.5509463Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5509752Z getattr(self, test_name)() 2025-12-04T12:36:07.5509981Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5510246Z fn() 2025-12-04T12:36:07.5510446Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5510671Z method(*args, **kwargs) 2025-12-04T12:36:07.5510887Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5511114Z method(*args, **kwargs) 2025-12-04T12:36:07.5511328Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5511550Z with policy(): 2025-12-04T12:36:07.5511761Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5511989Z raise RuntimeError(msg) 2025-12-04T12:36:07.5512385Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 0. CUDA driver allocated memory was 2019557376 and is now 3539992576. 2025-12-04T12:36:07.5512745Z 2025-12-04T12:36:07.5512819Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5513137Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:36:07.5513380Z 2025-12-04T12:36:07.5513484Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5513670Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:36:07.5513825Z ============================== 1 failed in 8.93s =============================== 2025-12-04T12:36:07.5513954Z Got exit code 1 2025-12-04T12:36:07.5514048Z Retrying single test... 
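Annotation: the FutureWarning that recurs throughout this log (from torch/distributed/fsdp/wrap.py:91 and test_fsdp_fine_tune.py:123) says the NO_SHARD sharding strategy is deprecated and suggests DistributedDataParallel instead. The sketch below shows that suggested replacement in isolation; the `wrap` helper and the `module`/`rank` arguments are assumptions, and an initialized process group is assumed.

    import torch
    from torch.nn.parallel import DistributedDataParallel as DDP

    def wrap(module: torch.nn.Module, rank: int) -> torch.nn.Module:
        module = module.to(torch.device("cuda", rank))
        # Deprecated pattern that triggers the warning in the log:
        #   FSDP(module, sharding_strategy=ShardingStrategy.NO_SHARD)
        # Replacement the warning suggests:
        return DDP(module, device_ids=[rank])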
2025-12-04T12:36:07.5514329Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-29338384b66251d1.xml 2025-12-04T12:36:07.5514620Z ============================= test session starts ============================== 2025-12-04T12:36:07.5514829Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:36:07.5515015Z cachedir: .pytest_cache 2025-12-04T12:36:07.5515240Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:36:07.5515481Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:36:07.5515597Z configfile: pytest.ini 2025-12-04T12:36:07.5515822Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:36:07.5516088Z collecting ... collected 4 items / 3 deselected / 1 selected 2025-12-04T12:36:07.5516394Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda 2025-12-04T12:36:07.5516668Z Running 1 items in this shard 2025-12-04T12:36:07.5516743Z 2025-12-04T12:36:07.5517035Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda I1204 12:33:55.923000 327211 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 327280 2025-12-04T12:36:07.5517511Z I1204 12:33:55.924000 327211 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 327281 2025-12-04T12:36:07.5518201Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:36:07.5518813Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:36:07.5519394Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:36:07.5520022Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:36:07.5520409Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:36:07.5520777Z return func(*args, **kwargs) 2025-12-04T12:36:07.5521131Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5521488Z return fsdp_fn(module, **kwargs) 2025-12-04T12:36:07.5521840Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5522196Z return fsdp_fn(module, **kwargs) 2025-12-04T12:36:07.5522539Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:123: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5522893Z seq = FSDP( 2025-12-04T12:36:07.5523206Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:123: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5523555Z seq = FSDP( 2025-12-04T12:36:07.5524878Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:36:07.5526290Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:36:07.5527767Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:36:07.5529164Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:36:07.5529464Z [rank0]:E1204 12:34:02.947000 327280 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:36:07.5529849Z [rank0]:E1204 12:34:02.947000 327280 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:36:07.5530339Z [rank0]:E1204 12:34:02.947000 327280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5530822Z [rank0]:E1204 12:34:02.947000 327280 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:36:07.5531301Z [rank0]:E1204 12:34:02.947000 327280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5531745Z [rank0]:E1204 12:34:02.947000 327280 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:36:07.5532184Z [rank0]:E1204 12:34:02.947000 327280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5532661Z [rank0]:E1204 12:34:02.947000 327280 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5533123Z [rank0]:E1204 12:34:02.947000 327280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5533628Z [rank0]:E1204 12:34:02.947000 327280 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5534086Z [rank0]:E1204 12:34:02.947000 327280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5534532Z [rank0]:E1204 12:34:02.947000 327280 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:36:07.5534982Z [rank0]:E1204 12:34:02.947000 327280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5535441Z [rank0]:E1204 12:34:02.947000 327280 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:36:07.5536085Z [rank0]:E1204 12:34:02.947000 327280 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 0. CUDA driver allocated memory was 2019557376 and is now 3539992576. 
2025-12-04T12:36:07.5536686Z [rank0]:E1204 12:34:02.947000 327280 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5537032Z [rank0]:E1204 12:34:02.947000 327280 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5537595Z [rank0]:E1204 12:34:02.947000 327280 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:36:07.5538100Z [rank0]:E1204 12:34:02.947000 327280 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5538462Z [rank0]:E1204 12:34:02.947000 327280 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5538872Z [rank0]:E1204 12:34:02.947000 327280 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:36:07.5539113Z dist init r=0, world=2 2025-12-04T12:36:07.5539313Z [rank1]:E1204 12:34:02.951000 327281 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:36:07.5539686Z [rank1]:E1204 12:34:02.951000 327281 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:36:07.5540170Z [rank1]:E1204 12:34:02.951000 327281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5540647Z [rank1]:E1204 12:34:02.951000 327281 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:36:07.5541121Z [rank1]:E1204 12:34:02.951000 327281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5541579Z [rank1]:E1204 12:34:02.951000 327281 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:36:07.5542012Z [rank1]:E1204 12:34:02.951000 327281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5542491Z [rank1]:E1204 12:34:02.951000 327281 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5542950Z [rank1]:E1204 12:34:02.951000 327281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5543408Z [rank1]:E1204 12:34:02.951000 327281 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5543868Z [rank1]:E1204 12:34:02.951000 327281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5544315Z [rank1]:E1204 12:34:02.951000 327281 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:36:07.5544767Z 
[rank1]:E1204 12:34:02.951000 327281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5545230Z [rank1]:E1204 12:34:02.951000 327281 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:36:07.5545869Z [rank1]:E1204 12:34:02.951000 327281 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 1. CUDA driver allocated memory was 1864368128 and is now 3384803328. 2025-12-04T12:36:07.5546468Z [rank1]:E1204 12:34:02.951000 327281 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5546815Z [rank1]:E1204 12:34:02.951000 327281 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5547409Z [rank1]:E1204 12:34:02.951000 327281 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:36:07.5547887Z [rank1]:E1204 12:34:02.951000 327281 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5548249Z [rank1]:E1204 12:34:02.951000 327281 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5548660Z [rank1]:E1204 12:34:02.951000 327281 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:36:07.5548897Z dist init r=1, world=2 2025-12-04T12:36:07.5549296Z [rank0]:[W1204 12:34:03.121269341 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:36:07.5549743Z FAILED [8.9169s] [100%] 2025-12-04T12:36:07.5549807Z 2025-12-04T12:36:07.5549864Z =================================== FAILURES =================================== 2025-12-04T12:36:07.5550053Z ____________ TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda _____________ 2025-12-04T12:36:07.5550228Z Traceback (most recent call last): 2025-12-04T12:36:07.5550471Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:36:07.5550734Z self._join_processes(fn) 2025-12-04T12:36:07.5550978Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:36:07.5551244Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:36:07.5551528Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:36:07.5551785Z raise RuntimeError(error) 2025-12-04T12:36:07.5551934Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:36:07.5552091Z Traceback (most recent call last): 2025-12-04T12:36:07.5552327Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5552568Z getattr(self, test_name)() 2025-12-04T12:36:07.5552799Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5553029Z fn() 2025-12-04T12:36:07.5553227Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5553457Z method(*args, **kwargs) 2025-12-04T12:36:07.5553679Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5553907Z method(*args, **kwargs) 2025-12-04T12:36:07.5554124Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5554348Z with policy(): 2025-12-04T12:36:07.5554561Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5554789Z raise RuntimeError(msg) 2025-12-04T12:36:07.5555179Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 0. CUDA driver allocated memory was 2019557376 and is now 3539992576. 
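The UserWarning from torch/distributed/fsdp/_init_utils.py earlier in this run says FSDP received `device_id` cuda without an explicit index and suggests either calling torch.cuda.set_device() before FSDP initialization or passing an indexed device. The following is a hedged sketch of that suggestion, not the code from test_fsdp_fine_tune.py: the nn.Linear module, the setup_fsdp helper name, and the localhost rendezvous values are placeholders.

    # Sketch of the fix the FSDP UserWarning above points at: make the device index explicit.
    import os
    import torch
    import torch.distributed as dist
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def setup_fsdp(rank: int, world_size: int) -> FSDP:
        # Placeholder rendezvous settings; a real launcher would provide these.
        os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
        os.environ.setdefault("MASTER_PORT", "29500")
        dist.init_process_group("nccl", rank=rank, world_size=world_size)
        torch.cuda.set_device(rank)                 # make the current device explicit, per the warning
        model = nn.Linear(16, 16)                   # placeholder module, not the test's model
        # An indexed device avoids the "does not have an explicit index" warning.
        return FSDP(model, device_id=torch.device("cuda", rank))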
2025-12-04T12:36:07.5555541Z 2025-12-04T12:36:07.5555616Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5555970Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:36:07.5556213Z 2025-12-04T12:36:07.5556303Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5556426Z 2025-12-04T12:36:07.5556485Z Process 1 exited with error code 10 and exception: 2025-12-04T12:36:07.5556624Z Traceback (most recent call last): 2025-12-04T12:36:07.5556869Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5557110Z getattr(self, test_name)() 2025-12-04T12:36:07.5557342Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5557574Z fn() 2025-12-04T12:36:07.5557773Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5558004Z method(*args, **kwargs) 2025-12-04T12:36:07.5558219Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5558445Z method(*args, **kwargs) 2025-12-04T12:36:07.5558659Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5558882Z with policy(): 2025-12-04T12:36:07.5559092Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5559334Z raise RuntimeError(msg) 2025-12-04T12:36:07.5559766Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 1. CUDA driver allocated memory was 1864368128 and is now 3384803328. 2025-12-04T12:36:07.5560136Z 2025-12-04T12:36:07.5560216Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5560538Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:36:07.5560776Z 2025-12-04T12:36:07.5560864Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5560989Z 2025-12-04T12:36:07.5560991Z 2025-12-04T12:36:07.5561067Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:36:07.5561266Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:36:07.5561632Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-29338384b66251d1.xml - 2025-12-04T12:36:07.5561969Z =========================== short test summary info ============================ 2025-12-04T12:36:07.5562296Z FAILED [8.9169s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:36:07.5562602Z Traceback (most recent call last): 2025-12-04T12:36:07.5562843Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5563084Z getattr(self, test_name)() 2025-12-04T12:36:07.5563314Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5563544Z fn() 2025-12-04T12:36:07.5563742Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5563970Z method(*args, **kwargs) 2025-12-04T12:36:07.5564192Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5564456Z method(*args, **kwargs) 2025-12-04T12:36:07.5564673Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5564896Z with policy(): 2025-12-04T12:36:07.5565106Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5565335Z raise RuntimeError(msg) 2025-12-04T12:36:07.5565727Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 0. CUDA driver allocated memory was 2019557376 and is now 3539992576. 
2025-12-04T12:36:07.5566089Z 2025-12-04T12:36:07.5566162Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5566482Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:36:07.5566723Z 2025-12-04T12:36:07.5566809Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5566934Z 2025-12-04T12:36:07.5566991Z Process 1 exited with error code 10 and exception: 2025-12-04T12:36:07.5567128Z Traceback (most recent call last): 2025-12-04T12:36:07.5567367Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5567626Z getattr(self, test_name)() 2025-12-04T12:36:07.5567857Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5568086Z fn() 2025-12-04T12:36:07.5568287Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5568528Z method(*args, **kwargs) 2025-12-04T12:36:07.5568746Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5568971Z method(*args, **kwargs) 2025-12-04T12:36:07.5569186Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5569409Z with policy(): 2025-12-04T12:36:07.5569662Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5569893Z raise RuntimeError(msg) 2025-12-04T12:36:07.5570282Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 1. CUDA driver allocated memory was 1864368128 and is now 3384803328. 2025-12-04T12:36:07.5570644Z 2025-12-04T12:36:07.5570718Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5571035Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:36:07.5571274Z 2025-12-04T12:36:07.5571362Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5571547Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:36:07.5571711Z ======================= 1 failed, 3 deselected in 8.93s ======================== 2025-12-04T12:36:07.5571849Z Got exit code 1 2025-12-04T12:36:07.5571943Z Retrying single test... 
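Two further warnings in these runs, barrier() "using the device under current context" and "destroy_process_group() was not called before program exit", both concern process-group setup and teardown, and both messages name their own remedy: pass `device_id` to init_process_group and call destroy_process_group before exit. A minimal sketch combining the two suggestions, assuming rank/world_size and the rendezvous environment are provided by a launcher:

    # Sketch addressing the barrier() device warning and the missing destroy_process_group()
    # warning seen in the log above. The run() helper and its wiring are placeholders.
    import torch
    import torch.distributed as dist

    def run(rank: int, world_size: int):
        dist.init_process_group(
            "nccl",
            rank=rank,
            world_size=world_size,
            device_id=torch.device("cuda", rank),   # mutes the barrier() "current context" warning
        )
        try:
            dist.barrier()
            # ... test or training body ...
        finally:
            dist.destroy_process_group()            # avoids the "not called before program exit" warning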
2025-12-04T12:36:07.5572207Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-d614fbc43046ce78.xml 2025-12-04T12:36:07.5572499Z ============================= test session starts ============================== 2025-12-04T12:36:07.5572743Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:36:07.5572932Z cachedir: .pytest_cache 2025-12-04T12:36:07.5573155Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:36:07.5573391Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:36:07.5573508Z configfile: pytest.ini 2025-12-04T12:36:07.5573733Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:36:07.5573999Z collecting ... collected 4 items / 3 deselected / 1 selected 2025-12-04T12:36:07.5574304Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda 2025-12-04T12:36:07.5574578Z Running 1 items in this shard 2025-12-04T12:36:07.5574649Z 2025-12-04T12:36:07.5574948Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda I1204 12:34:07.242000 327447 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 327516 2025-12-04T12:36:07.5575422Z I1204 12:34:07.242000 327447 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 327517 2025-12-04T12:36:07.5576108Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:36:07.5576711Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:36:07.5577293Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:36:07.5577887Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:36:07.5578273Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:36:07.5578641Z return func(*args, **kwargs) 2025-12-04T12:36:07.5578995Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5579351Z return fsdp_fn(module, **kwargs) 2025-12-04T12:36:07.5579750Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5580106Z return fsdp_fn(module, **kwargs) 2025-12-04T12:36:07.5580448Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:123: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5580784Z seq = FSDP( 2025-12-04T12:36:07.5581103Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:123: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5581436Z seq = FSDP( 2025-12-04T12:36:07.5582788Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:36:07.5584191Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:36:07.5585603Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:36:07.5587028Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:36:07.5587329Z [rank1]:E1204 12:34:14.327000 327517 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:36:07.5587667Z [rank1]:E1204 12:34:14.327000 327517 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:36:07.5588157Z [rank1]:E1204 12:34:14.327000 327517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5588638Z [rank1]:E1204 12:34:14.327000 327517 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:36:07.5589114Z [rank1]:E1204 12:34:14.327000 327517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5589562Z [rank1]:E1204 12:34:14.327000 327517 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:36:07.5590043Z [rank1]:E1204 12:34:14.327000 327517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5590506Z [rank1]:E1204 12:34:14.327000 327517 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5590970Z [rank1]:E1204 12:34:14.327000 327517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5591438Z [rank1]:E1204 12:34:14.327000 327517 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5591924Z [rank1]:E1204 12:34:14.327000 327517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5592372Z [rank1]:E1204 12:34:14.327000 327517 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:36:07.5592822Z [rank1]:E1204 12:34:14.327000 327517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5593284Z [rank1]:E1204 12:34:14.327000 327517 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:36:07.5593932Z [rank1]:E1204 12:34:14.327000 327517 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 1. CUDA driver allocated memory was 1864368128 and is now 3384803328. 
2025-12-04T12:36:07.5594530Z [rank1]:E1204 12:34:14.327000 327517 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5594875Z [rank1]:E1204 12:34:14.327000 327517 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5595437Z [rank1]:E1204 12:34:14.327000 327517 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:36:07.5595947Z [rank1]:E1204 12:34:14.327000 327517 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5596311Z [rank1]:E1204 12:34:14.327000 327517 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5596749Z [rank1]:E1204 12:34:14.327000 327517 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:36:07.5596988Z dist init r=1, world=2 2025-12-04T12:36:07.5597187Z [rank0]:E1204 12:34:14.328000 327516 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:36:07.5597518Z [rank0]:E1204 12:34:14.328000 327516 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:36:07.5598000Z [rank0]:E1204 12:34:14.328000 327516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5598474Z [rank0]:E1204 12:34:14.328000 327516 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:36:07.5598952Z [rank0]:E1204 12:34:14.328000 327516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5599396Z [rank0]:E1204 12:34:14.328000 327516 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:36:07.5599875Z [rank0]:E1204 12:34:14.328000 327516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5600335Z [rank0]:E1204 12:34:14.328000 327516 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5600833Z [rank0]:E1204 12:34:14.328000 327516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5601294Z [rank0]:E1204 12:34:14.328000 327516 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5601752Z [rank0]:E1204 12:34:14.328000 327516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5602199Z [rank0]:E1204 12:34:14.328000 327516 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:36:07.5602649Z 
[rank0]:E1204 12:34:14.328000 327516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5603111Z [rank0]:E1204 12:34:14.328000 327516 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:36:07.5603750Z [rank0]:E1204 12:34:14.328000 327516 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 0. CUDA driver allocated memory was 2019557376 and is now 3539992576. 2025-12-04T12:36:07.5604345Z [rank0]:E1204 12:34:14.328000 327516 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5604706Z [rank0]:E1204 12:34:14.328000 327516 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5605268Z [rank0]:E1204 12:34:14.328000 327516 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:36:07.5605762Z [rank0]:E1204 12:34:14.328000 327516 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5606120Z [rank0]:E1204 12:34:14.328000 327516 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5606529Z [rank0]:E1204 12:34:14.328000 327516 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:36:07.5606767Z dist init r=0, world=2 2025-12-04T12:36:07.5607163Z [rank0]:[W1204 12:34:14.503720322 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:36:07.5607567Z FAILED [8.9156s] [100%] 2025-12-04T12:36:07.5607633Z 2025-12-04T12:36:07.5607690Z =================================== FAILURES =================================== 2025-12-04T12:36:07.5607875Z ____________ TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda _____________ 2025-12-04T12:36:07.5608045Z Traceback (most recent call last): 2025-12-04T12:36:07.5608287Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:36:07.5608529Z self._join_processes(fn) 2025-12-04T12:36:07.5608777Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:36:07.5609040Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:36:07.5609306Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:36:07.5609563Z raise RuntimeError(error) 2025-12-04T12:36:07.5609769Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:36:07.5609969Z Traceback (most recent call last): 2025-12-04T12:36:07.5610214Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5610462Z getattr(self, test_name)() 2025-12-04T12:36:07.5610701Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5610937Z fn() 2025-12-04T12:36:07.5611143Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5611379Z method(*args, **kwargs) 2025-12-04T12:36:07.5611604Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5611838Z method(*args, **kwargs) 2025-12-04T12:36:07.5612062Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5612295Z with policy(): 2025-12-04T12:36:07.5612516Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5612754Z raise RuntimeError(msg) 2025-12-04T12:36:07.5613156Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 0. CUDA driver allocated memory was 2019557376 and is now 3539992576. 
2025-12-04T12:36:07.5613534Z 2025-12-04T12:36:07.5613611Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5613944Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:36:07.5614194Z 2025-12-04T12:36:07.5614299Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5614426Z 2025-12-04T12:36:07.5614489Z Process 1 exited with error code 10 and exception: 2025-12-04T12:36:07.5614633Z Traceback (most recent call last): 2025-12-04T12:36:07.5614877Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5615124Z getattr(self, test_name)() 2025-12-04T12:36:07.5615364Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5615602Z fn() 2025-12-04T12:36:07.5615807Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5616043Z method(*args, **kwargs) 2025-12-04T12:36:07.5616266Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5616501Z method(*args, **kwargs) 2025-12-04T12:36:07.5616723Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5616955Z with policy(): 2025-12-04T12:36:07.5617171Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5617408Z raise RuntimeError(msg) 2025-12-04T12:36:07.5617802Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 1. CUDA driver allocated memory was 1864368128 and is now 3384803328. 2025-12-04T12:36:07.5618168Z 2025-12-04T12:36:07.5618248Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5618569Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:36:07.5618813Z 2025-12-04T12:36:07.5618931Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5619056Z 2025-12-04T12:36:07.5619057Z 2025-12-04T12:36:07.5619139Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:36:07.5619342Z Process 0 terminated with exit code 10, terminating remaining processes. 
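The FutureWarning repeated throughout these sessions notes that the `NO_SHARD` sharding strategy is deprecated and points at DistributedDataParallel instead. Purely as an illustration of that pointer, and not as a change to test_fsdp_fine_tune.py, a no-sharding wrap with DDP could look like the sketch below; wrap_without_sharding is a made-up helper name and the module is a placeholder.

    # Sketch of the replacement the NO_SHARD FutureWarning suggests: DDP instead of FSDP(NO_SHARD).
    import torch
    import torch.nn as nn
    from torch.nn.parallel import DistributedDataParallel as DDP

    def wrap_without_sharding(model: nn.Module, rank: int) -> nn.Module:
        model = model.to(torch.device("cuda", rank))
        # DDP replicates parameters across ranks rather than sharding them,
        # which matches what NO_SHARD provided.
        return DDP(model, device_ids=[rank])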
2025-12-04T12:36:07.5619752Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-d614fbc43046ce78.xml - 2025-12-04T12:36:07.5620097Z =========================== short test summary info ============================ 2025-12-04T12:36:07.5620433Z FAILED [8.9156s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:36:07.5620747Z Traceback (most recent call last): 2025-12-04T12:36:07.5621001Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5621251Z getattr(self, test_name)() 2025-12-04T12:36:07.5621487Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5621725Z fn() 2025-12-04T12:36:07.5621929Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5622188Z method(*args, **kwargs) 2025-12-04T12:36:07.5622411Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5622646Z method(*args, **kwargs) 2025-12-04T12:36:07.5622869Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5623121Z with policy(): 2025-12-04T12:36:07.5623341Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5623576Z raise RuntimeError(msg) 2025-12-04T12:36:07.5623975Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 0. CUDA driver allocated memory was 2019557376 and is now 3539992576. 
2025-12-04T12:36:07.5624338Z 2025-12-04T12:36:07.5624414Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5624739Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:36:07.5624985Z 2025-12-04T12:36:07.5625073Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5625204Z 2025-12-04T12:36:07.5625265Z Process 1 exited with error code 10 and exception: 2025-12-04T12:36:07.5625409Z Traceback (most recent call last): 2025-12-04T12:36:07.5625654Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5625902Z getattr(self, test_name)() 2025-12-04T12:36:07.5626139Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5626376Z fn() 2025-12-04T12:36:07.5626584Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5626818Z method(*args, **kwargs) 2025-12-04T12:36:07.5627040Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5627273Z method(*args, **kwargs) 2025-12-04T12:36:07.5627532Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5627764Z with policy(): 2025-12-04T12:36:07.5627979Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5628215Z raise RuntimeError(msg) 2025-12-04T12:36:07.5628612Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 1. CUDA driver allocated memory was 1864368128 and is now 3384803328. 2025-12-04T12:36:07.5628978Z 2025-12-04T12:36:07.5629053Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5629375Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:36:07.5629656Z 2025-12-04T12:36:07.5629747Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5629941Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
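The AccumulateGrad stream-mismatch UserWarning emitted in each run names its own switch for the case where the mismatch is intentional: torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False). A minimal sketch of using it, with a toy CPU tensor standing in for a real CUDA/stream workload:

    # Sketch using the switch named in the AccumulateGrad stream-mismatch warning above.
    # Only appropriate when the stream mismatch is known to be intentional.
    import torch

    torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)

    # Backward then runs without the UserWarning from autograd/input_buffer.cpp.
    loss = (torch.randn(4, requires_grad=True) ** 2).sum()
    loss.backward()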
2025-12-04T12:36:07.5630109Z ======================= 1 failed, 3 deselected in 8.93s ======================== 2025-12-04T12:36:07.5630249Z Got exit code 1 2025-12-04T12:36:07.5630468Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda 2025-12-04T12:36:07.5630789Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:36:07.5631183Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-a791063d57910165.xml 2025-12-04T12:36:07.5631483Z ============================= test session starts ============================== 2025-12-04T12:36:07.5631713Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:36:07.5631912Z cachedir: .pytest_cache 2025-12-04T12:36:07.5632141Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:36:07.5632384Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:36:07.5632509Z configfile: pytest.ini 2025-12-04T12:36:07.5632740Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:36:07.5633016Z collecting ... collected 4 items / 1 deselected / 3 selected 2025-12-04T12:36:07.5633179Z stepcurrent: skipping 1 already run items. 2025-12-04T12:36:07.5633314Z Running 3 items in this shard 2025-12-04T12:36:07.5633392Z 2025-12-04T12:36:07.5633682Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda I1204 12:34:18.513000 327683 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 327752 2025-12-04T12:36:07.5634168Z I1204 12:34:18.514000 327683 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 327753 2025-12-04T12:36:07.5634858Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:36:07.5635445Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:36:07.5636052Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:36:07.5636637Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:36:07.5637026Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:36:07.5637397Z return func(*args, **kwargs) 2025-12-04T12:36:07.5637755Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5638117Z return fsdp_fn(module, **kwargs) 2025-12-04T12:36:07.5638472Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5638835Z return fsdp_fn(module, **kwargs) 2025-12-04T12:36:07.5639192Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:246: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5639535Z fsdp_seq = FSDP( 2025-12-04T12:36:07.5639900Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:246: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5640262Z fsdp_seq = FSDP( 2025-12-04T12:36:07.5641625Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:36:07.5643044Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:36:07.5644455Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:36:07.5645862Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:36:07.5646171Z [rank1]:E1204 12:34:26.868000 327753 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:36:07.5646543Z [rank1]:E1204 12:34:26.868000 327753 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:36:07.5647034Z [rank1]:E1204 12:34:26.868000 327753 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5647516Z [rank1]:E1204 12:34:26.868000 327753 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:36:07.5647997Z [rank1]:E1204 12:34:26.868000 327753 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5648446Z [rank1]:E1204 12:34:26.868000 327753 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:36:07.5648897Z [rank1]:E1204 12:34:26.868000 327753 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5649363Z [rank1]:E1204 12:34:26.868000 327753 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5649882Z [rank1]:E1204 12:34:26.868000 327753 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5650363Z [rank1]:E1204 12:34:26.868000 327753 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5650828Z [rank1]:E1204 12:34:26.868000 327753 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5651300Z [rank1]:E1204 12:34:26.868000 327753 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:36:07.5651764Z [rank1]:E1204 12:34:26.868000 327753 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5652232Z [rank1]:E1204 12:34:26.868000 327753 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:36:07.5652880Z [rank1]:E1204 12:34:26.868000 327753 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 30720 on device 1. CUDA driver allocated memory was 1864368128 and is now 3388997632. 
2025-12-04T12:36:07.5653485Z [rank1]:E1204 12:34:26.868000 327753 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5653837Z [rank1]:E1204 12:34:26.868000 327753 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5654415Z [rank1]:E1204 12:34:26.868000 327753 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:36:07.5654905Z [rank1]:E1204 12:34:26.868000 327753 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5655273Z [rank1]:E1204 12:34:26.868000 327753 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5655723Z [rank1]:E1204 12:34:26.868000 327753 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:36:07.5655970Z dist init r=1, world=2 2025-12-04T12:36:07.5656176Z [rank0]:E1204 12:34:26.883000 327752 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:36:07.5656519Z [rank0]:E1204 12:34:26.883000 327752 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:36:07.5657012Z [rank0]:E1204 12:34:26.883000 327752 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5657496Z [rank0]:E1204 12:34:26.883000 327752 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:36:07.5657981Z [rank0]:E1204 12:34:26.883000 327752 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5658437Z [rank0]:E1204 12:34:26.883000 327752 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:36:07.5658880Z [rank0]:E1204 12:34:26.883000 327752 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5659361Z [rank0]:E1204 12:34:26.883000 327752 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5659876Z [rank0]:E1204 12:34:26.883000 327752 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5660359Z [rank0]:E1204 12:34:26.883000 327752 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5660827Z [rank0]:E1204 12:34:26.883000 327752 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5661282Z [rank0]:E1204 12:34:26.883000 327752 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:36:07.5661746Z 
[rank0]:E1204 12:34:26.883000 327752 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5662214Z [rank0]:E1204 12:34:26.883000 327752 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:36:07.5662857Z [rank0]:E1204 12:34:26.883000 327752 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 29696 on device 0. CUDA driver allocated memory was 2019557376 and is now 3544186880. 2025-12-04T12:36:07.5663459Z [rank0]:E1204 12:34:26.883000 327752 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5663811Z [rank0]:E1204 12:34:26.883000 327752 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5664383Z [rank0]:E1204 12:34:26.883000 327752 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:36:07.5664872Z [rank0]:E1204 12:34:26.883000 327752 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5665273Z [rank0]:E1204 12:34:26.883000 327752 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5665690Z [rank0]:E1204 12:34:26.883000 327752 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:36:07.5665937Z dist init r=0, world=2 2025-12-04T12:36:07.5666342Z [rank0]:[W1204 12:34:27.082945502 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:36:07.5666755Z FAILED [10.2179s] [ 33%] 2025-12-04T12:36:07.5666822Z 2025-12-04T12:36:07.5666884Z =================================== FAILURES =================================== 2025-12-04T12:36:07.5667074Z _____________ TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda _____________ 2025-12-04T12:36:07.5667252Z Traceback (most recent call last): 2025-12-04T12:36:07.5667499Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:36:07.5667749Z self._join_processes(fn) 2025-12-04T12:36:07.5667999Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:36:07.5668268Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:36:07.5668556Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:36:07.5668820Z raise RuntimeError(error) 2025-12-04T12:36:07.5668976Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:36:07.5669143Z Traceback (most recent call last): 2025-12-04T12:36:07.5678218Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5678496Z getattr(self, test_name)() 2025-12-04T12:36:07.5678737Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5678972Z fn() 2025-12-04T12:36:07.5679178Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5679413Z method(*args, **kwargs) 2025-12-04T12:36:07.5679682Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5679915Z method(*args, **kwargs) 2025-12-04T12:36:07.5680135Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5680362Z with policy(): 2025-12-04T12:36:07.5680579Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5680813Z raise RuntimeError(msg) 2025-12-04T12:36:07.5681210Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 30720 on device 1. CUDA driver allocated memory was 1864368128 and is now 3388997632. 2025-12-04T12:36:07.5681574Z 2025-12-04T12:36:07.5681653Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5681972Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:36:07.5682218Z 2025-12-04T12:36:07.5682310Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5682437Z 2025-12-04T12:36:07.5682440Z 2025-12-04T12:36:07.5682523Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:36:07.5682786Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:36:07.5683159Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-a791063d57910165.xml - 2025-12-04T12:36:07.5683501Z =========================== short test summary info ============================ 2025-12-04T12:36:07.5683835Z FAILED [10.2179s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:36:07.5684146Z Traceback (most recent call last): 2025-12-04T12:36:07.5684391Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5684638Z getattr(self, test_name)() 2025-12-04T12:36:07.5684877Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5685111Z fn() 2025-12-04T12:36:07.5685314Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5685546Z method(*args, **kwargs) 2025-12-04T12:36:07.5685765Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5685994Z method(*args, **kwargs) 2025-12-04T12:36:07.5686229Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5686456Z with policy(): 2025-12-04T12:36:07.5686671Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5686904Z raise RuntimeError(msg) 2025-12-04T12:36:07.5687322Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 30720 on device 1. CUDA driver allocated memory was 1864368128 and is now 3388997632. 2025-12-04T12:36:07.5687684Z 2025-12-04T12:36:07.5687760Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5688083Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:36:07.5688329Z 2025-12-04T12:36:07.5688416Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5688606Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:36:07.5688775Z ======================= 1 failed, 1 deselected in 10.23s ======================= 2025-12-04T12:36:07.5688913Z Got exit code 1 2025-12-04T12:36:07.5689015Z Retrying single test... 
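The FutureWarning repeated above deprecates FSDP's `NO_SHARD` sharding strategy and points to `DistributedDataParallel` instead. A minimal sketch of that switch is below; it is only an illustration, with a placeholder module and rank rather than this test's own setup, and it also sets the device explicitly, which is what the recurring `device_id` UserWarning from fsdp/_init_utils.py asks for.

    import torch
    from torch.nn.parallel import DistributedDataParallel as DDP

    def wrap_for_data_parallel(module: torch.nn.Module, rank: int) -> torch.nn.Module:
        # Old pattern that triggers the FutureWarning seen in this log:
        #   FSDP(module, sharding_strategy=ShardingStrategy.NO_SHARD, device_id=...)
        # NO_SHARD keeps full parameters on every rank, so plain DDP is the
        # replacement the warning suggests.
        torch.cuda.set_device(rank)  # explicit device index, avoids the device_id warning
        return DDP(module.cuda(rank), device_ids=[rank])

The AccumulateGrad stream-mismatch UserWarning above likewise states its own opt-out: torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False), for cases where the mismatch is known to be intentional.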
2025-12-04T12:36:07.5689287Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-d4cf9f3cb356e0a3.xml 2025-12-04T12:36:07.5689636Z ============================= test session starts ============================== 2025-12-04T12:36:07.5689852Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:36:07.5690046Z cachedir: .pytest_cache 2025-12-04T12:36:07.5690271Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:36:07.5690515Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:36:07.5690635Z configfile: pytest.ini 2025-12-04T12:36:07.5690865Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:36:07.5691135Z collecting ... collected 4 items / 3 deselected / 1 selected 2025-12-04T12:36:07.5691478Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda 2025-12-04T12:36:07.5691756Z Running 1 items in this shard 2025-12-04T12:36:07.5691831Z 2025-12-04T12:36:07.5692126Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda I1204 12:34:31.236000 327919 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 327988 2025-12-04T12:36:07.5692602Z I1204 12:34:31.237000 327919 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 327989 2025-12-04T12:36:07.5693290Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:36:07.5693879Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:36:07.5694460Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:36:07.5695065Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:36:07.5695453Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:36:07.5695823Z return func(*args, **kwargs) 2025-12-04T12:36:07.5696210Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5696568Z return fsdp_fn(module, **kwargs) 2025-12-04T12:36:07.5696923Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5697279Z return fsdp_fn(module, **kwargs) 2025-12-04T12:36:07.5697627Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:246: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5697970Z fsdp_seq = FSDP( 2025-12-04T12:36:07.5698290Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:246: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5698630Z fsdp_seq = FSDP( 2025-12-04T12:36:07.5700047Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:36:07.5701469Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:36:07.5702878Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:36:07.5704281Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:36:07.5704587Z [rank0]:E1204 12:34:39.756000 327988 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:36:07.5704928Z [rank0]:E1204 12:34:39.756000 327988 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:36:07.5705448Z [rank0]:E1204 12:34:39.756000 327988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5705942Z [rank0]:E1204 12:34:39.756000 327988 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:36:07.5706421Z [rank0]:E1204 12:34:39.756000 327988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5706871Z [rank0]:E1204 12:34:39.756000 327988 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:36:07.5707313Z [rank0]:E1204 12:34:39.756000 327988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5707778Z [rank0]:E1204 12:34:39.756000 327988 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5708245Z [rank0]:E1204 12:34:39.756000 327988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5708706Z [rank0]:E1204 12:34:39.756000 327988 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5709170Z [rank0]:E1204 12:34:39.756000 327988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5709659Z [rank0]:E1204 12:34:39.756000 327988 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:36:07.5710114Z [rank0]:E1204 12:34:39.756000 327988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5710609Z [rank0]:E1204 12:34:39.756000 327988 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:36:07.5711251Z [rank0]:E1204 12:34:39.756000 327988 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 29696 on device 0. CUDA driver allocated memory was 2019557376 and is now 3544186880. 
2025-12-04T12:36:07.5711858Z [rank0]:E1204 12:34:39.756000 327988 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5712211Z [rank0]:E1204 12:34:39.756000 327988 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5712780Z [rank0]:E1204 12:34:39.756000 327988 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:36:07.5713262Z [rank0]:E1204 12:34:39.756000 327988 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5713622Z [rank0]:E1204 12:34:39.756000 327988 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5714034Z [rank0]:E1204 12:34:39.756000 327988 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:36:07.5714300Z dist init r=0, world=2 2025-12-04T12:36:07.5714502Z [rank1]:E1204 12:34:39.759000 327989 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:36:07.5714838Z [rank1]:E1204 12:34:39.759000 327989 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:36:07.5715340Z [rank1]:E1204 12:34:39.759000 327989 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5715821Z [rank1]:E1204 12:34:39.759000 327989 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:36:07.5716298Z [rank1]:E1204 12:34:39.759000 327989 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5716744Z [rank1]:E1204 12:34:39.759000 327989 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:36:07.5717180Z [rank1]:E1204 12:34:39.759000 327989 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5717649Z [rank1]:E1204 12:34:39.759000 327989 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5718112Z [rank1]:E1204 12:34:39.759000 327989 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5718588Z [rank1]:E1204 12:34:39.759000 327989 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5719054Z [rank1]:E1204 12:34:39.759000 327989 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5719507Z [rank1]:E1204 12:34:39.759000 327989 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:36:07.5720032Z 
[rank1]:E1204 12:34:39.759000 327989 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5720501Z [rank1]:E1204 12:34:39.759000 327989 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:36:07.5721145Z [rank1]:E1204 12:34:39.759000 327989 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 29696 on device 1. CUDA driver allocated memory was 1864368128 and is now 3388997632. 2025-12-04T12:36:07.5721748Z [rank1]:E1204 12:34:39.759000 327989 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5722096Z [rank1]:E1204 12:34:39.759000 327989 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5722668Z [rank1]:E1204 12:34:39.759000 327989 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:36:07.5723147Z [rank1]:E1204 12:34:39.759000 327989 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5723512Z [rank1]:E1204 12:34:39.759000 327989 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5723940Z [rank1]:E1204 12:34:39.759000 327989 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:36:07.5724182Z dist init r=1, world=2 2025-12-04T12:36:07.5724597Z [rank0]:[W1204 12:34:39.936644883 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:36:07.5725004Z FAILED [10.3183s] [100%] 2025-12-04T12:36:07.5725071Z 2025-12-04T12:36:07.5725132Z =================================== FAILURES =================================== 2025-12-04T12:36:07.5725319Z _____________ TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda _____________ 2025-12-04T12:36:07.5725493Z Traceback (most recent call last): 2025-12-04T12:36:07.5725738Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:36:07.5725982Z self._join_processes(fn) 2025-12-04T12:36:07.5726227Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:36:07.5726492Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:36:07.5726756Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:36:07.5727014Z raise RuntimeError(error) 2025-12-04T12:36:07.5727166Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:36:07.5727329Z Traceback (most recent call last): 2025-12-04T12:36:07.5727570Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5727815Z getattr(self, test_name)() 2025-12-04T12:36:07.5728045Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5728279Z fn() 2025-12-04T12:36:07.5728484Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5728715Z method(*args, **kwargs) 2025-12-04T12:36:07.5728961Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5729194Z method(*args, **kwargs) 2025-12-04T12:36:07.5729409Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5729674Z with policy(): 2025-12-04T12:36:07.5729887Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5730121Z raise RuntimeError(msg) 2025-12-04T12:36:07.5730518Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 29696 on device 0. CUDA driver allocated memory was 2019557376 and is now 3544186880. 2025-12-04T12:36:07.5730880Z 2025-12-04T12:36:07.5730954Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5731274Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:36:07.5731513Z 2025-12-04T12:36:07.5731599Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5731724Z 2025-12-04T12:36:07.5731725Z 2025-12-04T12:36:07.5731800Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:36:07.5732012Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:36:07.5732385Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-d4cf9f3cb356e0a3.xml - 2025-12-04T12:36:07.5732725Z =========================== short test summary info ============================ 2025-12-04T12:36:07.5733068Z FAILED [10.3183s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:36:07.5733372Z Traceback (most recent call last): 2025-12-04T12:36:07.5733614Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5733853Z getattr(self, test_name)() 2025-12-04T12:36:07.5734081Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5734311Z fn() 2025-12-04T12:36:07.5734509Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5734735Z method(*args, **kwargs) 2025-12-04T12:36:07.5734951Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5735177Z method(*args, **kwargs) 2025-12-04T12:36:07.5735390Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5735613Z with policy(): 2025-12-04T12:36:07.5735822Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5736049Z raise RuntimeError(msg) 2025-12-04T12:36:07.5736440Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 29696 on device 0. CUDA driver allocated memory was 2019557376 and is now 3544186880. 2025-12-04T12:36:07.5736800Z 2025-12-04T12:36:07.5736873Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5737192Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:36:07.5737433Z 2025-12-04T12:36:07.5737552Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5737740Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:36:07.5737899Z ======================= 1 failed, 3 deselected in 10.33s ======================= 2025-12-04T12:36:07.5738031Z Got exit code 1 2025-12-04T12:36:07.5738122Z Retrying single test... 
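Two other warnings recur on every retry: barrier() noting that it is "using the device under current context", and ProcessGroupNCCL complaining that destroy_process_group() was never called before program exit. A hedged sketch of the init/teardown shape those messages suggest follows; the backend, rank, and world size are illustrative, and an env:// rendezvous (MASTER_ADDR/MASTER_PORT already set) is assumed.

    import torch
    import torch.distributed as dist

    def run(rank: int, world_size: int) -> None:
        # Recent PyTorch accepts device_id here; the barrier() warning points at it.
        dist.init_process_group(
            backend="nccl",
            rank=rank,
            world_size=world_size,
            device_id=torch.device("cuda", rank),
        )
        try:
            dist.barrier()
            # ... test or training body ...
        finally:
            # Addresses the "destroy_process_group() was not called" warning.
            dist.destroy_process_group()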
2025-12-04T12:36:07.5738385Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-e01ef7e7dcdba8b3.xml 2025-12-04T12:36:07.5738680Z ============================= test session starts ============================== 2025-12-04T12:36:07.5738886Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:36:07.5739070Z cachedir: .pytest_cache 2025-12-04T12:36:07.5739295Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:36:07.5739530Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:36:07.5739684Z configfile: pytest.ini 2025-12-04T12:36:07.5739906Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:36:07.5740171Z collecting ... collected 4 items / 3 deselected / 1 selected 2025-12-04T12:36:07.5740473Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda 2025-12-04T12:36:07.5740763Z Running 1 items in this shard 2025-12-04T12:36:07.5740833Z 2025-12-04T12:36:07.5741122Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda I1204 12:34:44.057000 328155 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 328224 2025-12-04T12:36:07.5741604Z I1204 12:34:44.058000 328155 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 328225 2025-12-04T12:36:07.5742279Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:36:07.5742858Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:36:07.5743436Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:36:07.5744012Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:36:07.5744395Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:36:07.5744755Z return func(*args, **kwargs) 2025-12-04T12:36:07.5745105Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5745456Z return fsdp_fn(module, **kwargs) 2025-12-04T12:36:07.5745807Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5746162Z return fsdp_fn(module, **kwargs) 2025-12-04T12:36:07.5746537Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:246: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5746876Z fsdp_seq = FSDP( 2025-12-04T12:36:07.5747198Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:246: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5747531Z fsdp_seq = FSDP( 2025-12-04T12:36:07.5748845Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:36:07.5750301Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:36:07.5751710Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:36:07.5753121Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:36:07.5753420Z [rank0]:E1204 12:34:52.567000 328224 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:36:07.5753758Z [rank0]:E1204 12:34:52.567000 328224 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:36:07.5754241Z [rank0]:E1204 12:34:52.567000 328224 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5754714Z [rank0]:E1204 12:34:52.567000 328224 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:36:07.5755188Z [rank0]:E1204 12:34:52.567000 328224 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5755628Z [rank0]:E1204 12:34:52.567000 328224 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:36:07.5756097Z [rank0]:E1204 12:34:52.567000 328224 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5756554Z [rank0]:E1204 12:34:52.567000 328224 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5757016Z [rank0]:E1204 12:34:52.567000 328224 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5757475Z [rank0]:E1204 12:34:52.567000 328224 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5757935Z [rank0]:E1204 12:34:52.567000 328224 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5758383Z [rank0]:E1204 12:34:52.567000 328224 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:36:07.5758833Z [rank0]:E1204 12:34:52.567000 328224 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5759300Z [rank0]:E1204 12:34:52.567000 328224 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:36:07.5759986Z [rank0]:E1204 12:34:52.567000 328224 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 30720 on device 0. CUDA driver allocated memory was 2019557376 and is now 3544186880. 
2025-12-04T12:36:07.5760600Z [rank0]:E1204 12:34:52.567000 328224 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5760941Z [rank0]:E1204 12:34:52.567000 328224 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5761501Z [rank0]:E1204 12:34:52.567000 328224 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:36:07.5761979Z [rank0]:E1204 12:34:52.567000 328224 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5762337Z [rank0]:E1204 12:34:52.567000 328224 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5762748Z [rank0]:E1204 12:34:52.567000 328224 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:36:07.5762988Z dist init r=0, world=2 2025-12-04T12:36:07.5763185Z [rank1]:E1204 12:34:52.569000 328225 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:36:07.5763514Z [rank1]:E1204 12:34:52.569000 328225 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:36:07.5763992Z [rank1]:E1204 12:34:52.569000 328225 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5764465Z [rank1]:E1204 12:34:52.569000 328225 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:36:07.5764960Z [rank1]:E1204 12:34:52.569000 328225 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5765403Z [rank1]:E1204 12:34:52.569000 328225 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:36:07.5765835Z [rank1]:E1204 12:34:52.569000 328225 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5766292Z [rank1]:E1204 12:34:52.569000 328225 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5766749Z [rank1]:E1204 12:34:52.569000 328225 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5767201Z [rank1]:E1204 12:34:52.569000 328225 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5767659Z [rank1]:E1204 12:34:52.569000 328225 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5768105Z [rank1]:E1204 12:34:52.569000 328225 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:36:07.5768553Z 
[rank1]:E1204 12:34:52.569000 328225 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5769022Z [rank1]:E1204 12:34:52.569000 328225 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:36:07.5769693Z [rank1]:E1204 12:34:52.569000 328225 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 30720 on device 1. CUDA driver allocated memory was 1864368128 and is now 3388997632. 2025-12-04T12:36:07.5770298Z [rank1]:E1204 12:34:52.569000 328225 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5770642Z [rank1]:E1204 12:34:52.569000 328225 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5771205Z [rank1]:E1204 12:34:52.569000 328225 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:36:07.5771680Z [rank1]:E1204 12:34:52.569000 328225 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5772044Z [rank1]:E1204 12:34:52.569000 328225 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5772454Z [rank1]:E1204 12:34:52.569000 328225 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:36:07.5772691Z dist init r=1, world=2 2025-12-04T12:36:07.5773084Z [rank0]:[W1204 12:34:52.753575413 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:36:07.5773488Z FAILED [10.3183s] [100%] 2025-12-04T12:36:07.5773551Z 2025-12-04T12:36:07.5773608Z =================================== FAILURES =================================== 2025-12-04T12:36:07.5773789Z _____________ TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda _____________ 2025-12-04T12:36:07.5773958Z Traceback (most recent call last): 2025-12-04T12:36:07.5774225Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:36:07.5774465Z self._join_processes(fn) 2025-12-04T12:36:07.5774706Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:36:07.5774964Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:36:07.5775228Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:36:07.5775486Z raise RuntimeError(error) 2025-12-04T12:36:07.5775631Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:36:07.5775787Z Traceback (most recent call last): 2025-12-04T12:36:07.5776020Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5776262Z getattr(self, test_name)() 2025-12-04T12:36:07.5776490Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5776717Z fn() 2025-12-04T12:36:07.5776915Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5777142Z method(*args, **kwargs) 2025-12-04T12:36:07.5777360Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5777601Z method(*args, **kwargs) 2025-12-04T12:36:07.5777815Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5778037Z with policy(): 2025-12-04T12:36:07.5778246Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5778491Z raise RuntimeError(msg) 2025-12-04T12:36:07.5778882Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 30720 on device 0. CUDA driver allocated memory was 2019557376 and is now 3544186880. 2025-12-04T12:36:07.5779241Z 2025-12-04T12:36:07.5779314Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5779662Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:36:07.5779903Z 2025-12-04T12:36:07.5779989Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5780113Z 2025-12-04T12:36:07.5780115Z 2025-12-04T12:36:07.5780191Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:36:07.5780387Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:36:07.5780759Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-e01ef7e7dcdba8b3.xml - 2025-12-04T12:36:07.5781094Z =========================== short test summary info ============================ 2025-12-04T12:36:07.5781416Z FAILED [10.3183s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:36:07.5781718Z Traceback (most recent call last): 2025-12-04T12:36:07.5781956Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5782200Z getattr(self, test_name)() 2025-12-04T12:36:07.5782427Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5782721Z fn() 2025-12-04T12:36:07.5782921Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5783147Z method(*args, **kwargs) 2025-12-04T12:36:07.5783363Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5783591Z method(*args, **kwargs) 2025-12-04T12:36:07.5783803Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5784027Z with policy(): 2025-12-04T12:36:07.5784236Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5784465Z raise RuntimeError(msg) 2025-12-04T12:36:07.5784855Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 30720 on device 0. CUDA driver allocated memory was 2019557376 and is now 3544186880. 2025-12-04T12:36:07.5785217Z 2025-12-04T12:36:07.5785291Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5785602Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:36:07.5785841Z 2025-12-04T12:36:07.5785947Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5786128Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
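The RuntimeError itself comes from the harness's CUDA memory-leak check (the PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 flag in the repro command), which compares caching-allocator and driver allocations before and after the test body. A rough stand-alone illustration of that before/after comparison, not the actual CudaMemoryLeakCheck implementation in common_utils.py:

    import gc
    import torch

    def assert_no_cuda_leak(fn, device: int = 0) -> None:
        torch.cuda.synchronize(device)
        before = torch.cuda.memory_allocated(device)  # caching-allocator bytes in use
        fn()
        gc.collect()  # drop tensors kept alive only by lingering Python references
        torch.cuda.synchronize(device)
        after = torch.cuda.memory_allocated(device)
        if after > before:
            raise RuntimeError(
                f"possible leak: allocated memory was {before} and is now {after} on device {device}"
            )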
2025-12-04T12:36:07.5786285Z ======================= 1 failed, 3 deselected in 10.33s ======================= 2025-12-04T12:36:07.5786419Z Got exit code 1 2025-12-04T12:36:07.5786627Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda 2025-12-04T12:36:07.5786953Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:36:07.5787312Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-9f197ebd66670f53.xml 2025-12-04T12:36:07.5787599Z ============================= test session starts ============================== 2025-12-04T12:36:07.5787803Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:36:07.5787988Z cachedir: .pytest_cache 2025-12-04T12:36:07.5788206Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:36:07.5788439Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:36:07.5788551Z configfile: pytest.ini 2025-12-04T12:36:07.5788772Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:36:07.5789039Z collecting ... collected 4 items / 2 deselected / 2 selected 2025-12-04T12:36:07.5789193Z stepcurrent: skipping 2 already run items. 2025-12-04T12:36:07.5789318Z Running 2 items in this shard 2025-12-04T12:36:07.5789389Z 2025-12-04T12:36:07.5789711Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda I1204 12:34:56.791000 328391 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 328460 2025-12-04T12:36:07.5790171Z I1204 12:34:56.791000 328391 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 328461 2025-12-04T12:36:07.5790882Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:36:07.5791466Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:36:07.5792046Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:36:07.5792619Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:36:07.5793004Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:36:07.5793365Z return func(*args, **kwargs) 2025-12-04T12:36:07.5793715Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5794070Z return fsdp_fn(module, **kwargs) 2025-12-04T12:36:07.5794420Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5794787Z return fsdp_fn(module, **kwargs) 2025-12-04T12:36:07.5795125Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:298: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5795462Z fsdp_seq = FSDP( 2025-12-04T12:36:07.5795785Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:298: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5796131Z fsdp_seq = FSDP( 2025-12-04T12:36:07.5797444Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:36:07.5798850Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:36:07.5800317Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:36:07.5801711Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:36:07.5802009Z [rank0]:E1204 12:35:03.835000 328460 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:36:07.5802344Z [rank0]:E1204 12:35:03.835000 328460 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:36:07.5802825Z [rank0]:E1204 12:35:03.835000 328460 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5803299Z [rank0]:E1204 12:35:03.835000 328460 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:36:07.5803769Z [rank0]:E1204 12:35:03.835000 328460 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5804208Z [rank0]:E1204 12:35:03.835000 328460 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:36:07.5804658Z [rank0]:E1204 12:35:03.835000 328460 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5805116Z [rank0]:E1204 12:35:03.835000 328460 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5805599Z [rank0]:E1204 12:35:03.835000 328460 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5806056Z [rank0]:E1204 12:35:03.835000 328460 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5806510Z [rank0]:E1204 12:35:03.835000 328460 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5806958Z [rank0]:E1204 12:35:03.835000 328460 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:36:07.5807407Z [rank0]:E1204 12:35:03.835000 328460 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5807866Z [rank0]:E1204 12:35:03.835000 328460 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:36:07.5808493Z [rank0]:E1204 12:35:03.835000 328460 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 17408 on device 0. CUDA driver allocated memory was 2019557376 and is now 3525312512. 
2025-12-04T12:36:07.5809078Z [rank0]:E1204 12:35:03.835000 328460 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5809419Z [rank0]:E1204 12:35:03.835000 328460 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5810049Z [rank0]:E1204 12:35:03.835000 328460 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:36:07.5810520Z [rank0]:E1204 12:35:03.835000 328460 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5810878Z [rank0]:E1204 12:35:03.835000 328460 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5811290Z [rank0]:E1204 12:35:03.835000 328460 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:36:07.5811528Z dist init r=0, world=2 2025-12-04T12:36:07.5811726Z [rank1]:E1204 12:35:03.839000 328461 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:36:07.5812058Z [rank1]:E1204 12:35:03.839000 328461 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:36:07.5812540Z [rank1]:E1204 12:35:03.839000 328461 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5813014Z [rank1]:E1204 12:35:03.839000 328461 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:36:07.5813488Z [rank1]:E1204 12:35:03.839000 328461 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5813945Z [rank1]:E1204 12:35:03.839000 328461 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:36:07.5814377Z [rank1]:E1204 12:35:03.839000 328461 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5814856Z [rank1]:E1204 12:35:03.839000 328461 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5815313Z [rank1]:E1204 12:35:03.839000 328461 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5815767Z [rank1]:E1204 12:35:03.839000 328461 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5816226Z [rank1]:E1204 12:35:03.839000 328461 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5816673Z [rank1]:E1204 12:35:03.839000 328461 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:36:07.5817127Z 
[rank1]:E1204 12:35:03.839000 328461 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5817590Z [rank1]:E1204 12:35:03.839000 328461 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:36:07.5818223Z [rank1]:E1204 12:35:03.839000 328461 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 17408 on device 1. CUDA driver allocated memory was 1864368128 and is now 3370123264. 2025-12-04T12:36:07.5818810Z [rank1]:E1204 12:35:03.839000 328461 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5819175Z [rank1]:E1204 12:35:03.839000 328461 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5819770Z [rank1]:E1204 12:35:03.839000 328461 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:36:07.5820233Z [rank1]:E1204 12:35:03.839000 328461 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5820592Z [rank1]:E1204 12:35:03.839000 328461 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5820999Z [rank1]:E1204 12:35:03.839000 328461 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:36:07.5821235Z dist init r=1, world=2 2025-12-04T12:36:07.5821632Z [rank0]:[W1204 12:35:04.005838154 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:36:07.5822035Z FAILED [8.9183s] [ 50%] 2025-12-04T12:36:07.5822097Z 2025-12-04T12:36:07.5822152Z =================================== FAILURES =================================== 2025-12-04T12:36:07.5822331Z ________________ TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda ________________ 2025-12-04T12:36:07.5822511Z Traceback (most recent call last): 2025-12-04T12:36:07.5822750Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:36:07.5822989Z self._join_processes(fn) 2025-12-04T12:36:07.5823232Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:36:07.5823506Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:36:07.5823770Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:36:07.5824026Z raise RuntimeError(error) 2025-12-04T12:36:07.5824171Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:36:07.5824328Z Traceback (most recent call last): 2025-12-04T12:36:07.5824561Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5824798Z getattr(self, test_name)() 2025-12-04T12:36:07.5825027Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5825253Z fn() 2025-12-04T12:36:07.5825452Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5825678Z method(*args, **kwargs) 2025-12-04T12:36:07.5825896Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5826122Z method(*args, **kwargs) 2025-12-04T12:36:07.5826337Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5826558Z with policy(): 2025-12-04T12:36:07.5826766Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5826994Z raise RuntimeError(msg) 2025-12-04T12:36:07.5827374Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 17408 on device 0. CUDA driver allocated memory was 2019557376 and is now 3525312512. 2025-12-04T12:36:07.5827723Z 2025-12-04T12:36:07.5827795Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5828132Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:36:07.5828362Z 2025-12-04T12:36:07.5828448Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5828571Z 2025-12-04T12:36:07.5828573Z 2025-12-04T12:36:07.5828648Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:36:07.5828842Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:36:07.5829206Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-9f197ebd66670f53.xml - 2025-12-04T12:36:07.5829541Z =========================== short test summary info ============================ 2025-12-04T12:36:07.5829884Z FAILED [8.9183s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:36:07.5830174Z Traceback (most recent call last): 2025-12-04T12:36:07.5830413Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5830651Z getattr(self, test_name)() 2025-12-04T12:36:07.5830880Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5831124Z fn() 2025-12-04T12:36:07.5831320Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5831544Z method(*args, **kwargs) 2025-12-04T12:36:07.5831760Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5832006Z method(*args, **kwargs) 2025-12-04T12:36:07.5832221Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5832443Z with policy(): 2025-12-04T12:36:07.5832649Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5832877Z raise RuntimeError(msg) 2025-12-04T12:36:07.5833263Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 17408 on device 0. CUDA driver allocated memory was 2019557376 and is now 3525312512. 2025-12-04T12:36:07.5833613Z 2025-12-04T12:36:07.5833686Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5833990Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:36:07.5834220Z 2025-12-04T12:36:07.5834307Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5834490Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:36:07.5834649Z ======================= 1 failed, 2 deselected in 8.93s ======================== 2025-12-04T12:36:07.5834781Z Got exit code 1 2025-12-04T12:36:07.5834873Z Retrying single test... 
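Note on the repeated UserWarning above from torch/distributed/fsdp/_init_utils.py: FSDP received `device_id` as a bare "cuda" device with no index, so it falls back to the current device for each rank. The warning itself suggests either calling `torch.cuda.set_device()` before FSDP initialization or passing an explicit device index. A minimal sketch of that suggested pattern (illustrative only; `model` and `rank` are placeholder names, not the test's actual code, and the default process group is assumed to be initialized already):

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_model(model, rank):
        # Pin this process to its GPU before FSDP initialization, as the warning suggests,
        torch.cuda.set_device(rank)
        # or pass an explicit device index instead of the bare "cuda" device.
        return FSDP(model, device_id=torch.device("cuda", rank))
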
2025-12-04T12:36:07.5835134Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-e469e64fd235e843.xml 2025-12-04T12:36:07.5835425Z ============================= test session starts ============================== 2025-12-04T12:36:07.5835630Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:36:07.5835815Z cachedir: .pytest_cache 2025-12-04T12:36:07.5836064Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:36:07.5836299Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:36:07.5836413Z configfile: pytest.ini 2025-12-04T12:36:07.5836634Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:36:07.5836896Z collecting ... collected 4 items / 3 deselected / 1 selected 2025-12-04T12:36:07.5837188Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda 2025-12-04T12:36:07.5837450Z Running 1 items in this shard 2025-12-04T12:36:07.5837521Z 2025-12-04T12:36:07.5837797Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda I1204 12:35:08.090000 328627 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 328696 2025-12-04T12:36:07.5838260Z I1204 12:35:08.090000 328627 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 328697 2025-12-04T12:36:07.5838938Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:36:07.5839514Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:36:07.5840141Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:36:07.5840725Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:36:07.5841109Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:36:07.5841476Z return func(*args, **kwargs) 2025-12-04T12:36:07.5841822Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5842177Z return fsdp_fn(module, **kwargs) 2025-12-04T12:36:07.5842526Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5842879Z return fsdp_fn(module, **kwargs) 2025-12-04T12:36:07.5843218Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:298: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5843557Z fsdp_seq = FSDP( 2025-12-04T12:36:07.5843876Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:298: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5844210Z fsdp_seq = FSDP( 2025-12-04T12:36:07.5845568Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:36:07.5846970Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:36:07.5848379Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:36:07.5849857Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:36:07.5850153Z [rank0]:E1204 12:35:15.131000 328696 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:36:07.5850505Z [rank0]:E1204 12:35:15.131000 328696 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:36:07.5850988Z [rank0]:E1204 12:35:15.131000 328696 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5851459Z [rank0]:E1204 12:35:15.131000 328696 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:36:07.5851930Z [rank0]:E1204 12:35:15.131000 328696 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5852372Z [rank0]:E1204 12:35:15.131000 328696 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:36:07.5852808Z [rank0]:E1204 12:35:15.131000 328696 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5853264Z [rank0]:E1204 12:35:15.131000 328696 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5853723Z [rank0]:E1204 12:35:15.131000 328696 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5853870Z [rank0]:E1204 12:35:15.131000 328696 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5854144Z [rank0]:E1204 12:35:15.131000 328696 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5854307Z [rank0]:E1204 12:35:15.131000 328696 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:36:07.5854588Z [rank0]:E1204 12:35:15.131000 328696 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5854734Z [rank0]:E1204 12:35:15.131000 328696 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:36:07.5855181Z [rank0]:E1204 12:35:15.131000 328696 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 17408 on device 0. CUDA driver allocated memory was 2019557376 and is now 3525312512. 
2025-12-04T12:36:07.5855298Z [rank0]:E1204 12:35:15.131000 328696 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5855493Z [rank0]:E1204 12:35:15.131000 328696 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5855817Z [rank0]:E1204 12:35:15.131000 328696 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:36:07.5855943Z [rank0]:E1204 12:35:15.131000 328696 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5856153Z [rank0]:E1204 12:35:15.131000 328696 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5856316Z [rank0]:E1204 12:35:15.131000 328696 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:36:07.5856369Z dist init r=0, world=2 2025-12-04T12:36:07.5856506Z [rank1]:E1204 12:35:15.132000 328697 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:36:07.5856666Z [rank1]:E1204 12:35:15.132000 328697 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:36:07.5856954Z [rank1]:E1204 12:35:15.132000 328697 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5857107Z [rank1]:E1204 12:35:15.132000 328697 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:36:07.5857393Z [rank1]:E1204 12:35:15.132000 328697 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5857517Z [rank1]:E1204 12:35:15.132000 328697 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:36:07.5857792Z [rank1]:E1204 12:35:15.132000 328697 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5857936Z [rank1]:E1204 12:35:15.132000 328697 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5858211Z [rank1]:E1204 12:35:15.132000 328697 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5858356Z [rank1]:E1204 12:35:15.132000 328697 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5858650Z [rank1]:E1204 12:35:15.132000 328697 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5858787Z [rank1]:E1204 12:35:15.132000 328697 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:36:07.5859065Z 
[rank1]:E1204 12:35:15.132000 328697 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5859212Z [rank1]:E1204 12:35:15.132000 328697 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:36:07.5859700Z [rank1]:E1204 12:35:15.132000 328697 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 17920 on device 1. CUDA driver allocated memory was 1864368128 and is now 3370123264. 2025-12-04T12:36:07.5859814Z [rank1]:E1204 12:35:15.132000 328697 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5860008Z [rank1]:E1204 12:35:15.132000 328697 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5860327Z [rank1]:E1204 12:35:15.132000 328697 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:36:07.5860455Z [rank1]:E1204 12:35:15.132000 328697 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5860679Z [rank1]:E1204 12:35:15.132000 328697 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5860842Z [rank1]:E1204 12:35:15.132000 328697 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:36:07.5860880Z dist init r=1, world=2 2025-12-04T12:36:07.5861216Z [rank0]:[W1204 12:35:15.300778733 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:36:07.5861256Z FAILED [8.8153s] [100%] 2025-12-04T12:36:07.5861258Z 2025-12-04T12:36:07.5861314Z =================================== FAILURES =================================== 2025-12-04T12:36:07.5861402Z ________________ TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda ________________ 2025-12-04T12:36:07.5861448Z Traceback (most recent call last): 2025-12-04T12:36:07.5861611Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:36:07.5861654Z self._join_processes(fn) 2025-12-04T12:36:07.5861826Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:36:07.5861878Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:36:07.5862055Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:36:07.5862099Z raise RuntimeError(error) 2025-12-04T12:36:07.5862176Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:36:07.5862221Z Traceback (most recent call last): 2025-12-04T12:36:07.5862381Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5862424Z getattr(self, test_name)() 2025-12-04T12:36:07.5862604Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5862638Z fn() 2025-12-04T12:36:07.5862790Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5862829Z method(*args, **kwargs) 2025-12-04T12:36:07.5862980Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5863019Z method(*args, **kwargs) 2025-12-04T12:36:07.5863169Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5863205Z with policy(): 2025-12-04T12:36:07.5863359Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5863402Z raise RuntimeError(msg) 2025-12-04T12:36:07.5863716Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 17408 on device 0. CUDA driver allocated memory was 2019557376 and is now 3525312512. 2025-12-04T12:36:07.5863719Z 2025-12-04T12:36:07.5863795Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5863993Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:36:07.5864009Z 2025-12-04T12:36:07.5864098Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5864100Z 2025-12-04T12:36:07.5864102Z 2025-12-04T12:36:07.5864176Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:36:07.5864274Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:36:07.5864522Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-e469e64fd235e843.xml - 2025-12-04T12:36:07.5864583Z =========================== short test summary info ============================ 2025-12-04T12:36:07.5864799Z FAILED [8.8153s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:36:07.5864845Z Traceback (most recent call last): 2025-12-04T12:36:07.5865010Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5865052Z getattr(self, test_name)() 2025-12-04T12:36:07.5865211Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5865246Z fn() 2025-12-04T12:36:07.5865399Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5865440Z method(*args, **kwargs) 2025-12-04T12:36:07.5865591Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5865629Z method(*args, **kwargs) 2025-12-04T12:36:07.5865779Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5865816Z with policy(): 2025-12-04T12:36:07.5865969Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5866008Z raise RuntimeError(msg) 2025-12-04T12:36:07.5866352Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 17408 on device 0. CUDA driver allocated memory was 2019557376 and is now 3525312512. 2025-12-04T12:36:07.5866356Z 2025-12-04T12:36:07.5866430Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5866628Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:36:07.5866631Z 2025-12-04T12:36:07.5866717Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5866780Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:36:07.5866841Z ======================= 1 failed, 3 deselected in 8.82s ======================== 2025-12-04T12:36:07.5866878Z Got exit code 1 2025-12-04T12:36:07.5866919Z Retrying single test... 
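Note on the ProcessGroupNCCL warning above: the worker processes exited without calling destroy_process_group(), which the warning says can leak resources. A minimal sketch of the cleanup it asks for (a sketch only, not the harness's actual setup; rendezvous via MASTER_ADDR/MASTER_PORT environment variables is assumed):

    import torch.distributed as dist

    def run(rank, world_size):
        dist.init_process_group("nccl", rank=rank, world_size=world_size)
        try:
            pass  # test body / training step goes here
        finally:
            # Explicit teardown; this is the call the warning reports as missing.
            dist.destroy_process_group()
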
2025-12-04T12:36:07.5867118Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-9ce64f8147a99674.xml 2025-12-04T12:36:07.5867179Z ============================= test session starts ============================== 2025-12-04T12:36:07.5867290Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:36:07.5867331Z cachedir: .pytest_cache 2025-12-04T12:36:07.5867487Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:36:07.5867533Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:36:07.5867585Z configfile: pytest.ini 2025-12-04T12:36:07.5867753Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:36:07.5867823Z collecting ... collected 4 items / 3 deselected / 1 selected 2025-12-04T12:36:07.5868016Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda 2025-12-04T12:36:07.5868071Z Running 1 items in this shard 2025-12-04T12:36:07.5868073Z 2025-12-04T12:36:07.5868351Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda I1204 12:35:19.306000 328863 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 328932 2025-12-04T12:36:07.5868508Z I1204 12:35:19.306000 328863 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 328933 2025-12-04T12:36:07.5869002Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:36:07.5869067Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:36:07.5869556Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:36:07.5869641Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:36:07.5869932Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:36:07.5869976Z return func(*args, **kwargs) 2025-12-04T12:36:07.5870260Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5870331Z return fsdp_fn(module, **kwargs) 2025-12-04T12:36:07.5870609Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5870651Z return fsdp_fn(module, **kwargs) 2025-12-04T12:36:07.5870919Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:298: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5870957Z fsdp_seq = FSDP( 2025-12-04T12:36:07.5871219Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:298: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:36:07.5871256Z fsdp_seq = FSDP( 2025-12-04T12:36:07.5872519Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:36:07.5872676Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:36:07.5873928Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:36:07.5874049Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:36:07.5874193Z [rank0]:E1204 12:35:26.315000 328932 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:36:07.5874353Z [rank0]:E1204 12:35:26.315000 328932 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:36:07.5874643Z [rank0]:E1204 12:35:26.315000 328932 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5874797Z [rank0]:E1204 12:35:26.315000 328932 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:36:07.5875107Z [rank0]:E1204 12:35:26.315000 328932 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5875233Z [rank0]:E1204 12:35:26.315000 328932 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:36:07.5875507Z [rank0]:E1204 12:35:26.315000 328932 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5875655Z [rank0]:E1204 12:35:26.315000 328932 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5875930Z [rank0]:E1204 12:35:26.315000 328932 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5876079Z [rank0]:E1204 12:35:26.315000 328932 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5876353Z [rank0]:E1204 12:35:26.315000 328932 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5876489Z [rank0]:E1204 12:35:26.315000 328932 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:36:07.5876780Z [rank0]:E1204 12:35:26.315000 328932 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5876927Z [rank0]:E1204 12:35:26.315000 328932 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:36:07.5877384Z [rank0]:E1204 12:35:26.315000 328932 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 17920 on device 0. CUDA driver allocated memory was 2019557376 and is now 3525312512. 
2025-12-04T12:36:07.5877498Z [rank0]:E1204 12:35:26.315000 328932 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5877693Z [rank0]:E1204 12:35:26.315000 328932 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5878015Z [rank0]:E1204 12:35:26.315000 328932 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:36:07.5878129Z [rank0]:E1204 12:35:26.315000 328932 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5878338Z [rank0]:E1204 12:35:26.315000 328932 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5878501Z [rank0]:E1204 12:35:26.315000 328932 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:36:07.5878540Z dist init r=0, world=2 2025-12-04T12:36:07.5878676Z [rank1]:E1204 12:35:26.321000 328933 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:36:07.5878836Z [rank1]:E1204 12:35:26.321000 328933 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:36:07.5879120Z [rank1]:E1204 12:35:26.321000 328933 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5879294Z [rank1]:E1204 12:35:26.321000 328933 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:36:07.5879614Z [rank1]:E1204 12:35:26.321000 328933 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5879737Z [rank1]:E1204 12:35:26.321000 328933 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:36:07.5880015Z [rank1]:E1204 12:35:26.321000 328933 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5880161Z [rank1]:E1204 12:35:26.321000 328933 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5880438Z [rank1]:E1204 12:35:26.321000 328933 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5880582Z [rank1]:E1204 12:35:26.321000 328933 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5880857Z [rank1]:E1204 12:35:26.321000 328933 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5881012Z [rank1]:E1204 12:35:26.321000 328933 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:36:07.5881286Z 
[rank1]:E1204 12:35:26.321000 328933 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5881451Z [rank1]:E1204 12:35:26.321000 328933 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:36:07.5881890Z [rank1]:E1204 12:35:26.321000 328933 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 17408 on device 1. CUDA driver allocated memory was 1864368128 and is now 3370123264. 2025-12-04T12:36:07.5882006Z [rank1]:E1204 12:35:26.321000 328933 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5882198Z [rank1]:E1204 12:35:26.321000 328933 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5882524Z [rank1]:E1204 12:35:26.321000 328933 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:36:07.5882636Z [rank1]:E1204 12:35:26.321000 328933 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5882844Z [rank1]:E1204 12:35:26.321000 328933 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5883009Z [rank1]:E1204 12:35:26.321000 328933 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:36:07.5883046Z dist init r=1, world=2 2025-12-04T12:36:07.5883380Z [rank0]:[W1204 12:35:26.473645116 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:36:07.5883453Z FAILED [8.8173s] [100%] 2025-12-04T12:36:07.5883455Z 2025-12-04T12:36:07.5883512Z =================================== FAILURES =================================== 2025-12-04T12:36:07.5883600Z ________________ TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda ________________ 2025-12-04T12:36:07.5883648Z Traceback (most recent call last): 2025-12-04T12:36:07.5883809Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:36:07.5883855Z self._join_processes(fn) 2025-12-04T12:36:07.5884026Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:36:07.5884081Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:36:07.5884259Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:36:07.5884304Z raise RuntimeError(error) 2025-12-04T12:36:07.5884384Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:36:07.5884427Z Traceback (most recent call last): 2025-12-04T12:36:07.5884588Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5884629Z getattr(self, test_name)() 2025-12-04T12:36:07.5884787Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5884835Z fn() 2025-12-04T12:36:07.5884987Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5885026Z method(*args, **kwargs) 2025-12-04T12:36:07.5885177Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5885227Z method(*args, **kwargs) 2025-12-04T12:36:07.5885378Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5885415Z with policy(): 2025-12-04T12:36:07.5885567Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5885607Z raise RuntimeError(msg) 2025-12-04T12:36:07.5885922Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 17920 on device 0. CUDA driver allocated memory was 2019557376 and is now 3525312512. 
2025-12-04T12:36:07.5885925Z 2025-12-04T12:36:07.5885999Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5886197Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:36:07.5886201Z 2025-12-04T12:36:07.5886289Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5886291Z 2025-12-04T12:36:07.5886348Z Process 1 exited with error code 10 and exception: 2025-12-04T12:36:07.5886394Z Traceback (most recent call last): 2025-12-04T12:36:07.5886555Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5886598Z getattr(self, test_name)() 2025-12-04T12:36:07.5886755Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5886790Z fn() 2025-12-04T12:36:07.5886942Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5886981Z method(*args, **kwargs) 2025-12-04T12:36:07.5887152Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5887193Z method(*args, **kwargs) 2025-12-04T12:36:07.5887341Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5887379Z with policy(): 2025-12-04T12:36:07.5887531Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5887573Z raise RuntimeError(msg) 2025-12-04T12:36:07.5887887Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 17408 on device 1. CUDA driver allocated memory was 1864368128 and is now 3370123264. 2025-12-04T12:36:07.5887889Z 2025-12-04T12:36:07.5887961Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5888158Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:36:07.5888161Z 2025-12-04T12:36:07.5888246Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5888248Z 2025-12-04T12:36:07.5888250Z 2025-12-04T12:36:07.5888326Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:36:07.5888411Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:36:07.5888675Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-9ce64f8147a99674.xml - 2025-12-04T12:36:07.5888734Z =========================== short test summary info ============================ 2025-12-04T12:36:07.5888949Z FAILED [8.8173s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:36:07.5889005Z Traceback (most recent call last): 2025-12-04T12:36:07.5889171Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5889211Z getattr(self, test_name)() 2025-12-04T12:36:07.5889370Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5889404Z fn() 2025-12-04T12:36:07.5889555Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5889674Z method(*args, **kwargs) 2025-12-04T12:36:07.5889828Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5889869Z method(*args, **kwargs) 2025-12-04T12:36:07.5890021Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5890059Z with policy(): 2025-12-04T12:36:07.5890209Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5890251Z raise RuntimeError(msg) 2025-12-04T12:36:07.5890566Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 17920 on device 0. CUDA driver allocated memory was 2019557376 and is now 3525312512. 
2025-12-04T12:36:07.5890569Z 2025-12-04T12:36:07.5890644Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5890841Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:36:07.5890844Z 2025-12-04T12:36:07.5890959Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5890961Z 2025-12-04T12:36:07.5891020Z Process 1 exited with error code 10 and exception: 2025-12-04T12:36:07.5891066Z Traceback (most recent call last): 2025-12-04T12:36:07.5891227Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5891269Z getattr(self, test_name)() 2025-12-04T12:36:07.5891426Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5891461Z fn() 2025-12-04T12:36:07.5891612Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5891650Z method(*args, **kwargs) 2025-12-04T12:36:07.5891800Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5891840Z method(*args, **kwargs) 2025-12-04T12:36:07.5891990Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5892027Z with policy(): 2025-12-04T12:36:07.5892179Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5892219Z raise RuntimeError(msg) 2025-12-04T12:36:07.5892532Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 17408 on device 1. CUDA driver allocated memory was 1864368128 and is now 3370123264. 2025-12-04T12:36:07.5892549Z 2025-12-04T12:36:07.5892622Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5892817Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:36:07.5892833Z 2025-12-04T12:36:07.5892919Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5892986Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T12:36:07.5893049Z ======================= 1 failed, 3 deselected in 8.83s ======================== 2025-12-04T12:36:07.5893086Z Got exit code 1 2025-12-04T12:36:07.5893235Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda 2025-12-04T12:36:07.5893362Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:36:07.5893561Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-0015e4972511b8d3.xml 2025-12-04T12:36:07.5893618Z ============================= test session starts ============================== 2025-12-04T12:36:07.5893734Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:36:07.5893774Z cachedir: .pytest_cache 2025-12-04T12:36:07.5893933Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:36:07.5893978Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:36:07.5894019Z configfile: pytest.ini 2025-12-04T12:36:07.5894181Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:36:07.5894254Z collecting ... collected 4 items / 3 deselected / 1 selected 2025-12-04T12:36:07.5894306Z stepcurrent: skipping 3 already run items. 2025-12-04T12:36:07.5894350Z Running 1 items in this shard 2025-12-04T12:36:07.5894352Z 2025-12-04T12:36:07.5894666Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda I1204 12:35:30.535000 329099 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 329168 2025-12-04T12:36:07.5894822Z I1204 12:35:30.536000 329099 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 329169 2025-12-04T12:36:07.5895319Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:36:07.5895381Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:36:07.5895870Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:36:07.5895929Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:36:07.5896223Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T12:36:07.5896266Z return func(*args, **kwargs) 2025-12-04T12:36:07.5896421Z [rank1]:E1204 12:35:38.382000 329169 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:36:07.5896581Z [rank1]:E1204 12:35:38.382000 329169 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:36:07.5896871Z [rank1]:E1204 12:35:38.382000 329169 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5897043Z [rank1]:E1204 12:35:38.382000 329169 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:36:07.5897325Z [rank1]:E1204 12:35:38.382000 329169 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5897451Z [rank1]:E1204 12:35:38.382000 329169 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:36:07.5897731Z [rank1]:E1204 12:35:38.382000 329169 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5897881Z [rank1]:E1204 12:35:38.382000 329169 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5898155Z [rank1]:E1204 12:35:38.382000 329169 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5898302Z [rank1]:E1204 12:35:38.382000 329169 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5898575Z [rank1]:E1204 12:35:38.382000 329169 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5898711Z [rank1]:E1204 12:35:38.382000 329169 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:36:07.5899007Z [rank1]:E1204 12:35:38.382000 329169 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5899154Z [rank1]:E1204 12:35:38.382000 329169 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:36:07.5899646Z [rank1]:E1204 12:35:38.382000 329169 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 1. CUDA driver allocated memory was 1864368128 and is now 3340763136. 
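
The UserWarning from _init_utils.py printed at the start of each session above is advisory, not the failure itself: FSDP received device_id "cuda" with no index and fell back to the current device. As the warning suggests, the fix is to call torch.cuda.set_device() before constructing FSDP or to pass an explicit device index. A minimal sketch of that fix, with model and rank as placeholder names:

import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def shard_model(model, rank):
    # Make the per-rank device explicit so FSDP does not have to guess it.
    torch.cuda.set_device(rank)
    return FSDP(model, device_id=torch.device("cuda", rank))
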
2025-12-04T12:36:07.5899765Z [rank1]:E1204 12:35:38.382000 329169 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5899960Z [rank1]:E1204 12:35:38.382000 329169 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5900305Z [rank1]:E1204 12:35:38.382000 329169 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:36:07.5900418Z [rank1]:E1204 12:35:38.382000 329169 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5900628Z [rank1]:E1204 12:35:38.382000 329169 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5900807Z [rank1]:E1204 12:35:38.382000 329169 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:36:07.5900847Z dist init r=1, world=2 2025-12-04T12:36:07.5900983Z [rank0]:E1204 12:35:38.428000 329168 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:36:07.5901157Z [rank0]:E1204 12:35:38.428000 329168 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:36:07.5901442Z [rank0]:E1204 12:35:38.428000 329168 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5901593Z [rank0]:E1204 12:35:38.428000 329168 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:36:07.5901877Z [rank0]:E1204 12:35:38.428000 329168 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5902000Z [rank0]:E1204 12:35:38.428000 329168 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:36:07.5902281Z [rank0]:E1204 12:35:38.428000 329168 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5902427Z [rank0]:E1204 12:35:38.428000 329168 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5902703Z [rank0]:E1204 12:35:38.428000 329168 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5902849Z [rank0]:E1204 12:35:38.428000 329168 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5903124Z [rank0]:E1204 12:35:38.428000 329168 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5903283Z [rank0]:E1204 12:35:38.428000 329168 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T12:36:07.5903559Z [rank0]:E1204 12:35:38.428000 329168 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5903707Z [rank0]:E1204 12:35:38.428000 329168 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:36:07.5904167Z [rank0]:E1204 12:35:38.428000 329168 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 0. CUDA driver allocated memory was 2019557376 and is now 3495952384. 2025-12-04T12:36:07.5904285Z [rank0]:E1204 12:35:38.428000 329168 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5904480Z [rank0]:E1204 12:35:38.428000 329168 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5904820Z [rank0]:E1204 12:35:38.428000 329168 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:36:07.5904942Z [rank0]:E1204 12:35:38.428000 329168 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5905151Z [rank0]:E1204 12:35:38.428000 329168 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5905327Z [rank0]:E1204 12:35:38.428000 329168 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:36:07.5905365Z dist init r=0, world=2 2025-12-04T12:36:07.5905700Z [rank0]:[W1204 12:35:38.710265873 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:36:07.5905738Z FAILED [9.6167s] [100%] 2025-12-04T12:36:07.5905740Z 2025-12-04T12:36:07.5905796Z =================================== FAILURES =================================== 2025-12-04T12:36:07.5905890Z __________ TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda __________ 2025-12-04T12:36:07.5905937Z Traceback (most recent call last): 2025-12-04T12:36:07.5906099Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:36:07.5906143Z self._join_processes(fn) 2025-12-04T12:36:07.5906319Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:36:07.5906372Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:36:07.5906550Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:36:07.5906592Z raise RuntimeError(error) 2025-12-04T12:36:07.5906671Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:36:07.5906717Z Traceback (most recent call last): 2025-12-04T12:36:07.5906878Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5906920Z getattr(self, test_name)() 2025-12-04T12:36:07.5907078Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5907114Z fn() 2025-12-04T12:36:07.5907284Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5907324Z method(*args, **kwargs) 2025-12-04T12:36:07.5907475Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5907514Z method(*args, **kwargs) 2025-12-04T12:36:07.5907665Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5907703Z with policy(): 2025-12-04T12:36:07.5907856Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5907897Z raise RuntimeError(msg) 2025-12-04T12:36:07.5908231Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 1. CUDA driver allocated memory was 1864368128 and is now 3340763136. 2025-12-04T12:36:07.5908235Z 2025-12-04T12:36:07.5908309Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5908524Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:36:07.5908526Z 2025-12-04T12:36:07.5908614Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5908626Z 2025-12-04T12:36:07.5908628Z 2025-12-04T12:36:07.5908701Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:36:07.5908789Z Process 1 terminated with exit code 10, terminating remaining processes. 
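
The ProcessGroupNCCL warning above ("destroy_process_group() was not called before program exit") is about cleanup at exit rather than the leak check itself, but it points at the same resource-hygiene issue. A minimal sketch of the shutdown the linked docs recommend; the init arguments and structure are placeholders:

import torch.distributed as dist

def main():
    dist.init_process_group(backend="nccl")   # rendezvous via the usual env:// variables
    try:
        ...  # test or training body
    finally:
        dist.destroy_process_group()          # explicit shutdown avoids the warning above
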
2025-12-04T12:36:07.5909034Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-0015e4972511b8d3.xml - 2025-12-04T12:36:07.5909106Z =========================== short test summary info ============================ 2025-12-04T12:36:07.5909334Z FAILED [9.6167s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:36:07.5909380Z Traceback (most recent call last): 2025-12-04T12:36:07.5909544Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5909628Z getattr(self, test_name)() 2025-12-04T12:36:07.5909787Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5909821Z fn() 2025-12-04T12:36:07.5909972Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5910012Z method(*args, **kwargs) 2025-12-04T12:36:07.5910165Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5910204Z method(*args, **kwargs) 2025-12-04T12:36:07.5910355Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5910391Z with policy(): 2025-12-04T12:36:07.5910543Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5910583Z raise RuntimeError(msg) 2025-12-04T12:36:07.5910913Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 1. CUDA driver allocated memory was 1864368128 and is now 3340763136. 2025-12-04T12:36:07.5910917Z 2025-12-04T12:36:07.5911013Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5911231Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:36:07.5911233Z 2025-12-04T12:36:07.5911319Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5911382Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:36:07.5911444Z ======================= 1 failed, 3 deselected in 9.63s ======================== 2025-12-04T12:36:07.5911484Z Got exit code 1 2025-12-04T12:36:07.5911525Z Retrying single test... 
2025-12-04T12:36:07.5911727Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-0e0648c6f1ca5aaf.xml 2025-12-04T12:36:07.5911785Z ============================= test session starts ============================== 2025-12-04T12:36:07.5911898Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:36:07.5911941Z cachedir: .pytest_cache 2025-12-04T12:36:07.5912098Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:36:07.5912145Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:36:07.5912185Z configfile: pytest.ini 2025-12-04T12:36:07.5912347Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:36:07.5912436Z collecting ... collected 4 items / 3 deselected / 1 selected 2025-12-04T12:36:07.5912646Z stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:36:07.5912689Z Running 1 items in this shard 2025-12-04T12:36:07.5912703Z 2025-12-04T12:36:07.5912995Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda I1204 12:35:42.687000 329335 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 329404 2025-12-04T12:36:07.5913148Z I1204 12:35:42.688000 329335 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 329405 2025-12-04T12:36:07.5913644Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:36:07.5916312Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:36:07.5916816Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:36:07.5916879Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:36:07.5917172Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T12:36:07.5917219Z return func(*args, **kwargs) 2025-12-04T12:36:07.5917363Z [rank0]:E1204 12:35:50.629000 329404 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:36:07.5917527Z [rank0]:E1204 12:35:50.629000 329404 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:36:07.5917854Z [rank0]:E1204 12:35:50.629000 329404 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5918010Z [rank0]:E1204 12:35:50.629000 329404 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:36:07.5918296Z [rank0]:E1204 12:35:50.629000 329404 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5918422Z [rank0]:E1204 12:35:50.629000 329404 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:36:07.5918699Z [rank0]:E1204 12:35:50.629000 329404 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5918850Z [rank0]:E1204 12:35:50.629000 329404 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5919127Z [rank0]:E1204 12:35:50.629000 329404 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5919274Z [rank0]:E1204 12:35:50.629000 329404 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5919563Z [rank0]:E1204 12:35:50.629000 329404 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5919741Z [rank0]:E1204 12:35:50.629000 329404 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:36:07.5920042Z [rank0]:E1204 12:35:50.629000 329404 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5920192Z [rank0]:E1204 12:35:50.629000 329404 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:36:07.5920651Z [rank0]:E1204 12:35:50.629000 329404 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 0. CUDA driver allocated memory was 2019557376 and is now 3495952384. 
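
The other recurring UserWarning, from c10d_logger.py ("barrier(): using the device under current context"), can be silenced the way the message suggests: bind the process group to an explicit device at init time. A sketch assuming a torch.distributed version that accepts device_id in init_process_group and env-var rendezvous; rank is a placeholder:

import torch
import torch.distributed as dist

def init_distributed(rank):
    dist.init_process_group(
        backend="nccl",
        device_id=torch.device("cuda", rank),  # ties collectives such as barrier() to this device
    )
    dist.barrier()
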
2025-12-04T12:36:07.5920769Z [rank0]:E1204 12:35:50.629000 329404 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5920964Z [rank0]:E1204 12:35:50.629000 329404 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5921310Z [rank0]:E1204 12:35:50.629000 329404 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:36:07.5921424Z [rank0]:E1204 12:35:50.629000 329404 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5921637Z [rank0]:E1204 12:35:50.629000 329404 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5921802Z [rank0]:E1204 12:35:50.629000 329404 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:36:07.5921842Z dist init r=0, world=2 2025-12-04T12:36:07.5922006Z [rank1]:E1204 12:35:50.636000 329405 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:36:07.5922165Z [rank1]:E1204 12:35:50.636000 329405 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:36:07.5922452Z [rank1]:E1204 12:35:50.636000 329405 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5922605Z [rank1]:E1204 12:35:50.636000 329405 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:36:07.5922889Z [rank1]:E1204 12:35:50.636000 329405 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5923016Z [rank1]:E1204 12:35:50.636000 329405 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:36:07.5923297Z [rank1]:E1204 12:35:50.636000 329405 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5923443Z [rank1]:E1204 12:35:50.636000 329405 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5923721Z [rank1]:E1204 12:35:50.636000 329405 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5923881Z [rank1]:E1204 12:35:50.636000 329405 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5924157Z [rank1]:E1204 12:35:50.636000 329405 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5924304Z [rank1]:E1204 12:35:50.636000 329405 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T12:36:07.5924582Z [rank1]:E1204 12:35:50.636000 329405 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5924730Z [rank1]:E1204 12:35:50.636000 329405 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:36:07.5925186Z [rank1]:E1204 12:35:50.636000 329405 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 1. CUDA driver allocated memory was 1864368128 and is now 3340763136. 2025-12-04T12:36:07.5925299Z [rank1]:E1204 12:35:50.636000 329405 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5925493Z [rank1]:E1204 12:35:50.636000 329405 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5925831Z [rank1]:E1204 12:35:50.636000 329405 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:36:07.5925944Z [rank1]:E1204 12:35:50.636000 329405 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5926153Z [rank1]:E1204 12:35:50.636000 329405 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5926336Z [rank1]:E1204 12:35:50.636000 329405 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:36:07.5926376Z dist init r=1, world=2 2025-12-04T12:36:07.5926714Z [rank0]:[W1204 12:35:50.800091060 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:36:07.5926755Z FAILED [9.7170s] [100%] 2025-12-04T12:36:07.5926757Z 2025-12-04T12:36:07.5926811Z =================================== FAILURES =================================== 2025-12-04T12:36:07.5926907Z __________ TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda __________ 2025-12-04T12:36:07.5926953Z Traceback (most recent call last): 2025-12-04T12:36:07.5927118Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:36:07.5927163Z self._join_processes(fn) 2025-12-04T12:36:07.5927336Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:36:07.5927388Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:36:07.5927567Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:36:07.5927622Z raise RuntimeError(error) 2025-12-04T12:36:07.5927701Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:36:07.5927745Z Traceback (most recent call last): 2025-12-04T12:36:07.5927906Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5927947Z getattr(self, test_name)() 2025-12-04T12:36:07.5928117Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5928151Z fn() 2025-12-04T12:36:07.5928304Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5928346Z method(*args, **kwargs) 2025-12-04T12:36:07.5928496Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5928536Z method(*args, **kwargs) 2025-12-04T12:36:07.5928686Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5928723Z with policy(): 2025-12-04T12:36:07.5928876Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5928918Z raise RuntimeError(msg) 2025-12-04T12:36:07.5929251Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 0. CUDA driver allocated memory was 2019557376 and is now 3495952384. 2025-12-04T12:36:07.5929254Z 2025-12-04T12:36:07.5929329Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5929544Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:36:07.5929548Z 2025-12-04T12:36:07.5929678Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5929680Z 2025-12-04T12:36:07.5929682Z 2025-12-04T12:36:07.5929759Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:36:07.5929845Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:36:07.5930124Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-0e0648c6f1ca5aaf.xml - 2025-12-04T12:36:07.5930184Z =========================== short test summary info ============================ 2025-12-04T12:36:07.5930418Z FAILED [9.7170s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:36:07.5930462Z Traceback (most recent call last): 2025-12-04T12:36:07.5930627Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5930669Z getattr(self, test_name)() 2025-12-04T12:36:07.5930830Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5930864Z fn() 2025-12-04T12:36:07.5931019Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5931058Z method(*args, **kwargs) 2025-12-04T12:36:07.5931208Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5931246Z method(*args, **kwargs) 2025-12-04T12:36:07.5931398Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5931449Z with policy(): 2025-12-04T12:36:07.5931602Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5931641Z raise RuntimeError(msg) 2025-12-04T12:36:07.5931972Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 0. CUDA driver allocated memory was 2019557376 and is now 3495952384. 2025-12-04T12:36:07.5931989Z 2025-12-04T12:36:07.5932064Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5932278Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:36:07.5932280Z 2025-12-04T12:36:07.5932365Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5932427Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:36:07.5932490Z ======================= 1 failed, 3 deselected in 9.73s ======================== 2025-12-04T12:36:07.5932527Z Got exit code 1 2025-12-04T12:36:07.5932566Z Retrying single test... 
2025-12-04T12:36:07.5932770Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-5ade9bc37ba34ad9.xml 2025-12-04T12:36:07.5932831Z ============================= test session starts ============================== 2025-12-04T12:36:07.5932943Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:36:07.5932984Z cachedir: .pytest_cache 2025-12-04T12:36:07.5933140Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:36:07.5933187Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:36:07.5933227Z configfile: pytest.ini 2025-12-04T12:36:07.5933393Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:36:07.5933463Z collecting ... collected 4 items / 3 deselected / 1 selected 2025-12-04T12:36:07.5933673Z stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:36:07.5933719Z Running 1 items in this shard 2025-12-04T12:36:07.5933741Z 2025-12-04T12:36:07.5934029Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda I1204 12:35:54.799000 329571 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 329640 2025-12-04T12:36:07.5934182Z I1204 12:35:54.800000 329571 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 329641 2025-12-04T12:36:07.5934684Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:36:07.5934747Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:36:07.5935232Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:36:07.5935293Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:36:07.5935588Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T12:36:07.5935642Z return func(*args, **kwargs) 2025-12-04T12:36:07.5935786Z [rank1]:E1204 12:36:02.720000 329641 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:36:07.5935959Z [rank1]:E1204 12:36:02.720000 329641 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:36:07.5936248Z [rank1]:E1204 12:36:02.720000 329641 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5936403Z [rank1]:E1204 12:36:02.720000 329641 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:36:07.5936689Z [rank1]:E1204 12:36:02.720000 329641 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5936814Z [rank1]:E1204 12:36:02.720000 329641 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:36:07.5937094Z [rank1]:E1204 12:36:02.720000 329641 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5937242Z [rank1]:E1204 12:36:02.720000 329641 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5937517Z [rank1]:E1204 12:36:02.720000 329641 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5937666Z [rank1]:E1204 12:36:02.720000 329641 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5937943Z [rank1]:E1204 12:36:02.720000 329641 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5938100Z [rank1]:E1204 12:36:02.720000 329641 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:36:07.5938377Z [rank1]:E1204 12:36:02.720000 329641 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5938524Z [rank1]:E1204 12:36:02.720000 329641 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:36:07.5938980Z [rank1]:E1204 12:36:02.720000 329641 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 1. CUDA driver allocated memory was 1864368128 and is now 3340763136. 
2025-12-04T12:36:07.5939095Z [rank1]:E1204 12:36:02.720000 329641 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5939293Z [rank1]:E1204 12:36:02.720000 329641 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5939670Z [rank1]:E1204 12:36:02.720000 329641 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:36:07.5939797Z [rank1]:E1204 12:36:02.720000 329641 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5940007Z [rank1]:E1204 12:36:02.720000 329641 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5940171Z [rank1]:E1204 12:36:02.720000 329641 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:36:07.5940228Z dist init r=1, world=2 2025-12-04T12:36:07.5940363Z [rank0]:E1204 12:36:02.781000 329640 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:36:07.5940522Z [rank0]:E1204 12:36:02.781000 329640 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:36:07.5940808Z [rank0]:E1204 12:36:02.781000 329640 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5940962Z [rank0]:E1204 12:36:02.781000 329640 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:36:07.5941242Z [rank0]:E1204 12:36:02.781000 329640 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5941367Z [rank0]:E1204 12:36:02.781000 329640 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:36:07.5941646Z [rank0]:E1204 12:36:02.781000 329640 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5941791Z [rank0]:E1204 12:36:02.781000 329640 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5942066Z [rank0]:E1204 12:36:02.781000 329640 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5942211Z [rank0]:E1204 12:36:02.781000 329640 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:36:07.5942514Z [rank0]:E1204 12:36:02.781000 329640 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5942650Z [rank0]:E1204 12:36:02.781000 329640 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T12:36:07.5942926Z [rank0]:E1204 12:36:02.781000 329640 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5943073Z [rank0]:E1204 12:36:02.781000 329640 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:36:07.5943527Z [rank0]:E1204 12:36:02.781000 329640 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 0. CUDA driver allocated memory was 2019557376 and is now 3495952384. 2025-12-04T12:36:07.5943644Z [rank0]:E1204 12:36:02.781000 329640 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5943839Z [rank0]:E1204 12:36:02.781000 329640 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5944189Z [rank0]:E1204 12:36:02.781000 329640 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:36:07.5944301Z [rank0]:E1204 12:36:02.781000 329640 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:36:07.5944524Z [rank0]:E1204 12:36:02.781000 329640 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5944689Z [rank0]:E1204 12:36:02.781000 329640 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:36:07.5944727Z dist init r=0, world=2 2025-12-04T12:36:07.5945062Z [rank0]:[W1204 12:36:03.056022483 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:36:07.5945101Z FAILED [9.7173s] [100%] 2025-12-04T12:36:07.5945103Z 2025-12-04T12:36:07.5945159Z =================================== FAILURES =================================== 2025-12-04T12:36:07.5945251Z __________ TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda __________ 2025-12-04T12:36:07.5945299Z Traceback (most recent call last): 2025-12-04T12:36:07.5945463Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:36:07.5945507Z self._join_processes(fn) 2025-12-04T12:36:07.5945678Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:36:07.5945732Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:36:07.5945910Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:36:07.5945955Z raise RuntimeError(error) 2025-12-04T12:36:07.5946032Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:36:07.5946077Z Traceback (most recent call last): 2025-12-04T12:36:07.5946238Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5946311Z getattr(self, test_name)() 2025-12-04T12:36:07.5946469Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5946504Z fn() 2025-12-04T12:36:07.5946655Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5946696Z method(*args, **kwargs) 2025-12-04T12:36:07.5946847Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5946887Z method(*args, **kwargs) 2025-12-04T12:36:07.5947039Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5947075Z with policy(): 2025-12-04T12:36:07.5947227Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5947269Z raise RuntimeError(msg) 2025-12-04T12:36:07.5947597Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 1. CUDA driver allocated memory was 1864368128 and is now 3340763136. 2025-12-04T12:36:07.5947599Z 2025-12-04T12:36:07.5947672Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5947886Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:36:07.5947898Z 2025-12-04T12:36:07.5947984Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5947986Z 2025-12-04T12:36:07.5947989Z 2025-12-04T12:36:07.5948063Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:36:07.5948161Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:36:07.5948409Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-5ade9bc37ba34ad9.xml - 2025-12-04T12:36:07.5948469Z =========================== short test summary info ============================ 2025-12-04T12:36:07.5948698Z FAILED [9.7173s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:36:07.5948744Z Traceback (most recent call last): 2025-12-04T12:36:07.5948908Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:36:07.5948951Z getattr(self, test_name)() 2025-12-04T12:36:07.5949112Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:36:07.5949149Z fn() 2025-12-04T12:36:07.5949300Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5949340Z method(*args, **kwargs) 2025-12-04T12:36:07.5949489Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:36:07.5949529Z method(*args, **kwargs) 2025-12-04T12:36:07.5949710Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:36:07.5949748Z with policy(): 2025-12-04T12:36:07.5949899Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:36:07.5949940Z raise RuntimeError(msg) 2025-12-04T12:36:07.5950301Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 1. CUDA driver allocated memory was 1864368128 and is now 3340763136. 2025-12-04T12:36:07.5950306Z 2025-12-04T12:36:07.5950378Z To execute this test, run the following from the base repo dir: 2025-12-04T12:36:07.5950592Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:36:07.5950594Z 2025-12-04T12:36:07.5950682Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:36:07.5950744Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
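The leak check behind this failure is the context manager in common_utils.py whose __exit__ appears in the traceback; it is enabled through PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 in the repro command and compares caching-allocator and driver-level memory taken before and after the test body. A rough, simplified approximation of that comparison using public torch.cuda counters (not the internal check itself; run_suspect_test is a hypothetical stand-in):

    import torch

    device = torch.device("cuda:0")
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)     # caching-allocator bytes in use
    free_before, _total = torch.cuda.mem_get_info(device)  # driver-level free memory

    run_suspect_test()  # hypothetical stand-in for the test body

    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()  # release unused cached blocks so driver numbers are comparable
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _ = torch.cuda.mem_get_info(device)

    if alloc_after > alloc_before or free_after < free_before:
        raise RuntimeError(
            f"possible leak: allocator {alloc_before} -> {alloc_after} bytes, "
            f"driver free {free_before} -> {free_after} bytes"
        )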
2025-12-04T12:36:07.5950804Z ======================= 1 failed, 3 deselected in 9.73s ======================== 2025-12-04T12:36:07.5950841Z Got exit code 1 2025-12-04T12:36:07.5951005Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:36:07.5951135Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:36:07.5951338Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-43708990400729c0.xml 2025-12-04T12:36:07.5951395Z ============================= test session starts ============================== 2025-12-04T12:36:07.5951508Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:36:07.5951562Z cachedir: .pytest_cache 2025-12-04T12:36:07.5951720Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:36:07.5951765Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:36:07.5951806Z configfile: pytest.ini 2025-12-04T12:36:07.5951982Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:36:07.5952054Z collecting ... collected 4 items / 4 deselected / 0 selected 2025-12-04T12:36:07.5952106Z stepcurrent: skipping 4 already run items. 2025-12-04T12:36:07.5952148Z Running 0 items in this shard 2025-12-04T12:36:07.5952150Z 2025-12-04T12:36:07.5952391Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-43708990400729c0.xml - 2025-12-04T12:36:07.5952450Z ============================ 4 deselected in 0.00s ============================= 2025-12-04T12:36:07.5953052Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda', 'test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda', 'test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda', 'test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda'] 2025-12-04T12:36:07.5953055Z 2025-12-04T12:36:07.5953246Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_fine_tune 1/1 (test/test-reports/distributed.fsdp.test_fsdp_fine_tune_1.1_aed87725c804591d_.log) 2025-12-04T12:36:07.5953248Z 2025-12-04T12:36:07.5953376Z Finished distributed/fsdp/test_fsdp_fine_tune 1/1 ... [2025-12-04 12:36:07.546114][5229808.525152349], took 2.41min 2025-12-04T12:36:07.5953641Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T12:36:07.5953729Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:36:07.5953821Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T12:36:07.5953871Z Uploading artifacts took 0.00 seconds 2025-12-04T12:36:07.5953928Z distributed/fsdp/test_fsdp_fine_tune 1/1 failed! 2025-12-04T12:36:07.5954061Z Running distributed/test_multi_threaded_pg 1/1 ... 
[2025-12-04 12:36:07.549081][5229808.52812229] 2025-12-04T12:36:07.5954110Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:36:07.5954425Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_multi_threaded_pg.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:36:07.549264] 2025-12-04T12:36:10.0172331Z 2025-12-04T12:36:10.0172708Z distributed/test_multi_threaded_pg 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_multi_threaded_pg_1.1_063d4919acdf1ad9_.log 2025-12-04T12:36:10.0176490Z Running 22 items in this shard: test/distributed/test_multi_threaded_pg.py::TestCollectivesWithWrapper::test_all_to_all_single_list, test/distributed/test_multi_threaded_pg.py::TestCollectivesWithWrapper::test_all_to_all_single_none, test/distributed/test_multi_threaded_pg.py::TestCollectivesWithWrapper::test_all_to_all_single_tensor, test/distributed/test_multi_threaded_pg.py::TestCollectivesWithWrapper::test_broadcast_object_list, test/distributed/test_multi_threaded_pg.py::TestCollectivesWithWrapper::test_collective_error_on_rank_non_zero, test/distributed/test_multi_threaded_pg.py::TestCollectivesWithWrapper::test_collective_error_on_rank_non_zero_all, test/distributed/test_multi_threaded_pg.py::TestCollectivesWithWrapper::test_collective_error_on_rank_zero, test/distributed/test_multi_threaded_pg.py::TestCollectivesWithWrapper::test_skip, test/distributed/test_multi_threaded_pg.py::TestCollectivesWithBaseClass::test_all_reduce, test/distributed/test_multi_threaded_pg.py::TestCollectivesWithBaseClass::test_all_reduce_coalesced, test/distributed/test_multi_threaded_pg.py::TestCollectivesWithBaseClass::test_all_reduce_ops, test/distributed/test_multi_threaded_pg.py::TestCollectivesWithBaseClass::test_all_to_all, test/distributed/test_multi_threaded_pg.py::TestCollectivesWithBaseClass::test_allgather, test/distributed/test_multi_threaded_pg.py::TestCollectivesWithBaseClass::test_assert_equal_on_rank, test/distributed/test_multi_threaded_pg.py::TestCollectivesWithBaseClass::test_broadcast, test/distributed/test_multi_threaded_pg.py::TestCollectivesWithBaseClass::test_broadcast_object_list, test/distributed/test_multi_threaded_pg.py::TestCollectivesWithBaseClass::test_bwd_sees_fwd_pg, test/distributed/test_multi_threaded_pg.py::TestCollectivesWithBaseClass::test_gather, test/distributed/test_multi_threaded_pg.py::TestCollectivesWithBaseClass::test_reduce_scatter, test/distributed/test_multi_threaded_pg.py::TestCollectivesWithBaseClass::test_scatter, test/distributed/test_multi_threaded_pg.py::TestCollectivesWithBaseClass::test_subpg, test/distributed/test_multi_threaded_pg.py::TestCollectivesWithBaseClass::test_using_pg_from_another_thread 2025-12-04T12:36:10.0180481Z 2025-12-04T12:36:10.0180630Z Finished distributed/test_multi_threaded_pg 1/1 ... [2025-12-04 12:36:10.016880][5229810.995918539], took 0.04min 2025-12-04T12:36:10.0184483Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T12:36:10.0196221Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:36:10.0197714Z Running distributed/_composable/fsdp/test_fully_shard_extensions 1/1 ... 
[2025-12-04 12:36:10.019629][5229810.998670123] 2025-12-04T12:36:10.0198305Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:36:10.0199947Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_composable/fsdp/test_fully_shard_extensions.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:36:10.019822] 2025-12-04T12:36:36.1719324Z 2025-12-04T12:36:36.1729770Z distributed/_composable/fsdp/test_fully_shard_extensions 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.fsdp.test_fully_shard_extensions_1.1_d99f22c17891004c_.log 2025-12-04T12:36:36.1733254Z Running 5 items in this shard: test/distributed/_composable/fsdp/test_fully_shard_extensions.py::TestFullyShardAllGatherExtensionsMultiProcess::test_all_gather_extensions_train_parity, test/distributed/_composable/fsdp/test_fully_shard_extensions.py::TestFullyShardAllGatherExtensionsMultiThread::test_all_gather_extension_hsdp_mesh, test/distributed/_composable/fsdp/test_fully_shard_extensions.py::TestFullyShardAllGatherExtensionsMultiThread::test_all_gather_extension_outer_size_stride, test/distributed/_composable/fsdp/test_fully_shard_extensions.py::TestFullyShardAllGatherExtensionsMultiThread::test_all_gather_extensions_end_to_end, test/distributed/_composable/fsdp/test_fully_shard_extensions.py::TestFullyShardAllGatherExtensionsMultiThread::test_all_gather_extensions_monkey_patch 2025-12-04T12:36:36.1735426Z 2025-12-04T12:36:36.1735696Z Finished distributed/_composable/fsdp/test_fully_shard_extensions 1/1 ... [2025-12-04 12:36:36.171610][5229837.150646935], took 0.44min 2025-12-04T12:36:36.1739732Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T12:36:36.1752401Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:36:36.1753028Z Running distributed/checkpoint/test_file_system_checkpoint_cpu 1/1 ... [2025-12-04 12:36:36.175167][5229837.154208038] 2025-12-04T12:36:36.1754654Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:36:36.1755214Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_file_system_checkpoint_cpu.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:36:36.175364] 2025-12-04T12:36:58.5729909Z 2025-12-04T12:36:58.5731054Z distributed/checkpoint/test_file_system_checkpoint_cpu 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_file_system_checkpoint_cpu_1.1_1a588f4f72c9e6bd_.log 2025-12-04T12:36:58.5739757Z Running 16 items in this shard: test/distributed/checkpoint/test_file_system_checkpoint_cpu.py::TestDistributedStateDictSaveLoad::test_read_write_only_tensor_thread_count_1, test/distributed/checkpoint/test_file_system_checkpoint_cpu.py::TestDistributedStateDictSaveLoad::test_read_write_only_tensor_thread_count_2, test/distributed/checkpoint/test_file_system_checkpoint_cpu.py::TestDistributedStateDictSaveLoadRot13::test_read_write_tensor_and_blob_thread_count_1, test/distributed/checkpoint/test_file_system_checkpoint_cpu.py::TestDistributedStateDictSaveLoadRot13::test_read_write_tensor_and_blob_thread_count_2, test/distributed/checkpoint/test_file_system_checkpoint_cpu.py::TestDistributedStateDictSaveLoadZStandard::test_read_write_only_tensor_thread_count_1, test/distributed/checkpoint/test_file_system_checkpoint_cpu.py::TestDistributedStateDictSaveLoadZStandard::test_read_write_only_tensor_thread_count_2, test/distributed/checkpoint/test_file_system_checkpoint_cpu.py::TestDistributedStateDictSaveLoadWithSharedTensor::test_read_write_shard_tensor_thread_count_1, test/distributed/checkpoint/test_file_system_checkpoint_cpu.py::TestDistributedStateDictSaveLoadWithSharedTensor::test_read_write_shard_tensor_thread_count_2, test/distributed/checkpoint/test_file_system_checkpoint_cpu.py::TestDistributedReshardOnLoad::test_load_rowwise_to_colwise_thread_count_1, test/distributed/checkpoint/test_file_system_checkpoint_cpu.py::TestDistributedReshardOnLoad::test_load_rowwise_to_colwise_thread_count_2, test/distributed/checkpoint/test_file_system_checkpoint_cpu.py::TestDistributedReshardOnLoad::test_load_with_different_shard_plan_thread_count_1, test/distributed/checkpoint/test_file_system_checkpoint_cpu.py::TestDistributedReshardOnLoad::test_load_with_different_shard_plan_thread_count_2, test/distributed/checkpoint/test_file_system_checkpoint_cpu.py::TestDistributedReshardOnLoad::test_save_load_bytes_thread_count_1, test/distributed/checkpoint/test_file_system_checkpoint_cpu.py::TestDistributedReshardOnLoad::test_save_load_bytes_thread_count_2, test/distributed/checkpoint/test_file_system_checkpoint_cpu.py::TestDistributedReshardOnLoad::test_switch_between_sharded_tensor_to_tensor_thread_count_1, test/distributed/checkpoint/test_file_system_checkpoint_cpu.py::TestDistributedReshardOnLoad::test_switch_between_sharded_tensor_to_tensor_thread_count_2 2025-12-04T12:36:58.5746390Z 2025-12-04T12:36:58.5746636Z Finished distributed/checkpoint/test_file_system_checkpoint_cpu 1/1 ... [2025-12-04 12:36:58.572753][5229859.551788427], took 0.37min 2025-12-04T12:36:58.5750170Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T12:36:58.5760684Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:36:58.5762695Z Running distributed/fsdp/test_wrap 1/1 ... 
[2025-12-04 12:36:58.576185][5229859.555225602] 2025-12-04T12:36:58.5762925Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:36:58.5765121Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_wrap.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:36:58.576403] 2025-12-04T12:39:14.6569886Z 2025-12-04T12:39:14.6570996Z distributed/fsdp/test_wrap 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_wrap_1.1_3aac12bd02055555_.log 2025-12-04T12:39:14.6588618Z Running 52 items in this shard: test/distributed/fsdp/test_wrap.py::TestFSDPWrap::test_bn_always_wrapped_individually, test/distributed/fsdp/test_wrap.py::TestFSDPWrap::test_error_already_wrapped_nested_False_device_init_mode0, test/distributed/fsdp/test_wrap.py::TestFSDPWrap::test_error_already_wrapped_nested_False_device_init_mode1, test/distributed/fsdp/test_wrap.py::TestFSDPWrap::test_error_already_wrapped_nested_True_device_init_mode0, test/distributed/fsdp/test_wrap.py::TestFSDPWrap::test_error_already_wrapped_nested_True_device_init_mode1, test/distributed/fsdp/test_wrap.py::TestFSDPWrap::test_main_wrap_api_cpu_offload0_backward_prefetch0_forward_prefetch_False_device_init_mode0, test/distributed/fsdp/test_wrap.py::TestFSDPWrap::test_main_wrap_api_cpu_offload0_backward_prefetch0_forward_prefetch_False_device_init_mode1, test/distributed/fsdp/test_wrap.py::TestFSDPWrap::test_main_wrap_api_cpu_offload0_backward_prefetch0_forward_prefetch_True_device_init_mode0, test/distributed/fsdp/test_wrap.py::TestFSDPWrap::test_main_wrap_api_cpu_offload0_backward_prefetch0_forward_prefetch_True_device_init_mode1, test/distributed/fsdp/test_wrap.py::TestFSDPWrap::test_main_wrap_api_cpu_offload0_backward_prefetch1_forward_prefetch_False_device_init_mode0, test/distributed/fsdp/test_wrap.py::TestFSDPWrap::test_main_wrap_api_cpu_offload0_backward_prefetch1_forward_prefetch_False_device_init_mode1, test/distributed/fsdp/test_wrap.py::TestFSDPWrap::test_main_wrap_api_cpu_offload0_backward_prefetch1_forward_prefetch_True_device_init_mode0, test/distributed/fsdp/test_wrap.py::TestFSDPWrap::test_main_wrap_api_cpu_offload0_backward_prefetch1_forward_prefetch_True_device_init_mode1, test/distributed/fsdp/test_wrap.py::TestFSDPWrap::test_main_wrap_api_cpu_offload1_backward_prefetch0_forward_prefetch_False_device_init_mode0, test/distributed/fsdp/test_wrap.py::TestFSDPWrap::test_main_wrap_api_cpu_offload1_backward_prefetch0_forward_prefetch_False_device_init_mode1, test/distributed/fsdp/test_wrap.py::TestFSDPWrap::test_main_wrap_api_cpu_offload1_backward_prefetch0_forward_prefetch_True_device_init_mode0, test/distributed/fsdp/test_wrap.py::TestFSDPWrap::test_main_wrap_api_cpu_offload1_backward_prefetch0_forward_prefetch_True_device_init_mode1, test/distributed/fsdp/test_wrap.py::TestFSDPWrap::test_main_wrap_api_cpu_offload1_backward_prefetch1_forward_prefetch_False_device_init_mode0, test/distributed/fsdp/test_wrap.py::TestFSDPWrap::test_main_wrap_api_cpu_offload1_backward_prefetch1_forward_prefetch_False_device_init_mode1, test/distributed/fsdp/test_wrap.py::TestFSDPWrap::test_main_wrap_api_cpu_offload1_backward_prefetch1_forward_prefetch_True_device_init_mode0, test/distributed/fsdp/test_wrap.py::TestFSDPWrap::test_main_wrap_api_cpu_offload1_backward_prefetch1_forward_prefetch_True_device_init_mode1, 
test/distributed/fsdp/test_wrap.py::TestFSDPWrap::test_wrap_batchnorm_individually_use_or_policy_False, test/distributed/fsdp/test_wrap.py::TestFSDPWrap::test_wrap_batchnorm_individually_use_or_policy_True, test/distributed/fsdp/test_wrap.py::TestFSDPWrap::test_zero_argument, test/distributed/fsdp/test_wrap.py::TestAutoWrap::test_always_wrap, test/distributed/fsdp/test_wrap.py::TestAutoWrap::test_always_wrap_with_ignored_modules_wrap_method0, test/distributed/fsdp/test_wrap.py::TestAutoWrap::test_always_wrap_with_ignored_modules_wrap_method1, test/distributed/fsdp/test_wrap.py::TestAutoWrap::test_auto_wrap_api, test/distributed/fsdp/test_wrap.py::TestAutoWrap::test_auto_wrap_preset_exclude_wrap, test/distributed/fsdp/test_wrap.py::TestAutoWrap::test_auto_wrap_preset_exclude_wrap_include_children, test/distributed/fsdp/test_wrap.py::TestAutoWrap::test_auto_wrap_preset_force_leaf, test/distributed/fsdp/test_wrap.py::TestAutoWrap::test_auto_wrap_preset_force_leaf_custom, test/distributed/fsdp/test_wrap.py::TestAutoWrap::test_auto_wrap_smoke_test_device_init_mode0_cpu_offload0_use_device_id_False, test/distributed/fsdp/test_wrap.py::TestAutoWrap::test_auto_wrap_smoke_test_device_init_mode0_cpu_offload0_use_device_id_True, test/distributed/fsdp/test_wrap.py::TestAutoWrap::test_auto_wrap_smoke_test_device_init_mode0_cpu_offload1_use_device_id_False, test/distributed/fsdp/test_wrap.py::TestAutoWrap::test_auto_wrap_smoke_test_device_init_mode0_cpu_offload1_use_device_id_True, test/distributed/fsdp/test_wrap.py::TestAutoWrap::test_auto_wrap_smoke_test_device_init_mode1_cpu_offload0_use_device_id_False, test/distributed/fsdp/test_wrap.py::TestAutoWrap::test_auto_wrap_smoke_test_device_init_mode1_cpu_offload0_use_device_id_True, test/distributed/fsdp/test_wrap.py::TestAutoWrap::test_auto_wrap_smoke_test_device_init_mode1_cpu_offload1_use_device_id_False, test/distributed/fsdp/test_wrap.py::TestAutoWrap::test_auto_wrap_smoke_test_device_init_mode1_cpu_offload1_use_device_id_True, test/distributed/fsdp/test_wrap.py::TestAutoWrap::test_auto_wrap_with_ignored_modules_wrap_method0, test/distributed/fsdp/test_wrap.py::TestAutoWrap::test_auto_wrap_with_ignored_modules_wrap_method1, test/distributed/fsdp/test_wrap.py::TestAutoWrap::test_custom_policy, test/distributed/fsdp/test_wrap.py::TestAutoWrap::test_frozen_params, test/distributed/fsdp/test_wrap.py::TestAutoWrap::test_module_wrap_policy, test/distributed/fsdp/test_wrap.py::TestAutoWrap::test_module_wrap_policy_callable, test/distributed/fsdp/test_wrap.py::TestAutoWrap::test_transformer_auto_wrap_policy, test/distributed/fsdp/test_wrap.py::TestAutoWrap::test_wrap_disabled_outside_context, test/distributed/fsdp/test_wrap.py::TestAutoWrap::test_wrap_override_defaults, test/distributed/fsdp/test_wrap.py::TestAutoWrap::test_wrap_wrap_method0, test/distributed/fsdp/test_wrap.py::TestAutoWrap::test_wrap_wrap_method1, test/distributed/fsdp/test_wrap.py::TestWrapUtils::test_validate_frozen_params 2025-12-04T12:39:14.6598511Z 2025-12-04T12:39:14.6598656Z Finished distributed/fsdp/test_wrap 1/1 ... 
[2025-12-04 12:39:14.656643][5229995.635680802], took 2.27min 2025-12-04T12:39:14.6599162Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T12:39:14.6599704Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:39:14.6600001Z Running distributed/fsdp/test_hsdp_dtensor_state_dict 1/1 ... [2025-12-04 12:39:14.659390][5229995.638431366] 2025-12-04T12:39:14.6600248Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:39:14.6600729Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_hsdp_dtensor_state_dict.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:39:14.659612] 2025-12-04T12:44:29.0254745Z 2025-12-04T12:44:29.0255851Z PRINTING LOG FILE of distributed/fsdp/test_hsdp_dtensor_state_dict 1/1 (test/test-reports/distributed.fsdp.test_hsdp_dtensor_state_dict_1.1_60de516b7e1e2204_.log) 2025-12-04T12:44:29.0256912Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-cc2b69bb2c652278.xml 2025-12-04T12:44:29.0257685Z ============================= test session starts ============================== 2025-12-04T12:44:29.0258170Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:44:29.0258579Z cachedir: .pytest_cache 2025-12-04T12:44:29.0259128Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:44:29.0259829Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:44:29.0260783Z configfile: pytest.ini 2025-12-04T12:44:29.0261301Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:44:29.0261820Z collecting ... 
collected 8 items 2025-12-04T12:44:29.0262105Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T12:44:29.0265650Z Running 8 items in this shard: test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda, test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda, test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda, test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda, test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda, test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda, test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_hsdp_init_with_device_mesh_cuda, test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_root_module_is_not_FSDP_cuda 2025-12-04T12:44:29.0269919Z 2025-12-04T12:44:29.0270648Z distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda I1204 12:39:16.427000 339794 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 339863 2025-12-04T12:44:29.0271644Z I1204 12:39:16.428000 339794 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 339864 2025-12-04T12:44:29.0272236Z I1204 12:39:16.429000 339794 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 339865 2025-12-04T12:44:29.0272829Z I1204 12:39:16.429000 339794 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 339866 2025-12-04T12:44:29.0274604Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:243: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0275594Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0276568Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:243: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:44:29.0277541Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0278543Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:243: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0279569Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0280582Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:243: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0281567Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0283313Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:44:29.0284831Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.0286330Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:44:29.0287768Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.0289289Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:44:29.0290805Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.0292242Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:44:29.0293704Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.0294002Z E1204 12:39:25.182000 339863 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0294337Z E1204 12:39:25.182000 339863 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0294819Z E1204 12:39:25.182000 339863 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0295290Z E1204 12:39:25.182000 339863 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0295757Z E1204 12:39:25.182000 339863 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0296192Z E1204 12:39:25.182000 339863 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0296657Z E1204 12:39:25.182000 339863 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0297111Z E1204 12:39:25.182000 339863 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0297565Z E1204 12:39:25.182000 339863 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0298015Z E1204 12:39:25.182000 339863 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0298467Z E1204 12:39:25.182000 339863 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0298907Z E1204 12:39:25.182000 339863 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0299350Z E1204 12:39:25.182000 339863 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0299842Z E1204 12:39:25.182000 339863 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0300547Z E1204 12:39:25.182000 339863 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3242196992. 
2025-12-04T12:44:29.0301231Z E1204 12:39:25.182000 339863 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0301588Z E1204 12:39:25.182000 339863 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0302243Z E1204 12:39:25.182000 339863 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0302812Z E1204 12:39:25.182000 339863 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0303169Z E1204 12:39:25.182000 339863 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0303570Z E1204 12:39:25.182000 339863 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:44:29.0303904Z E1204 12:39:25.193000 339866 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0304229Z E1204 12:39:25.193000 339866 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0304704Z E1204 12:39:25.193000 339866 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0305170Z E1204 12:39:25.193000 339866 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0305636Z E1204 12:39:25.193000 339866 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0306072Z E1204 12:39:25.193000 339866 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0306552Z E1204 12:39:25.193000 339866 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0307005Z E1204 12:39:25.193000 339866 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0307459Z E1204 12:39:25.193000 339866 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0307909Z E1204 12:39:25.193000 339866 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0308362Z E1204 12:39:25.193000 339866 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0308806Z E1204 12:39:25.193000 339866 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0309249Z E1204 12:39:25.193000 339866 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0309753Z E1204 12:39:25.193000 339866 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0310474Z E1204 12:39:25.193000 339866 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 3. CUDA driver allocated memory was 1105199104 and is now 3076521984. 2025-12-04T12:44:29.0311156Z E1204 12:39:25.193000 339866 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0311496Z E1204 12:39:25.193000 339866 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0312153Z E1204 12:39:25.193000 339866 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0312721Z E1204 12:39:25.193000 339866 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0313074Z E1204 12:39:25.193000 339866 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0313479Z E1204 12:39:25.193000 339866 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:44:29.0313808Z E1204 12:39:25.210000 339865 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0314149Z E1204 12:39:25.210000 339865 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0314623Z E1204 12:39:25.210000 339865 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0315090Z E1204 12:39:25.210000 339865 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0315591Z E1204 12:39:25.210000 339865 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0316022Z E1204 12:39:25.210000 339865 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0316444Z E1204 12:39:25.210000 339865 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0316889Z E1204 12:39:25.210000 339865 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0317339Z E1204 12:39:25.210000 339865 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 
3329, in wrapper 2025-12-04T12:44:29.0317783Z E1204 12:39:25.210000 339865 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0318235Z E1204 12:39:25.210000 339865 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0318670Z E1204 12:39:25.210000 339865 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0319107Z E1204 12:39:25.210000 339865 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0319612Z E1204 12:39:25.210000 339865 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0320322Z E1204 12:39:25.210000 339865 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 2. CUDA driver allocated memory was 1268776960 and is now 3076521984. 2025-12-04T12:44:29.0320997Z E1204 12:39:25.210000 339865 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0321331Z E1204 12:39:25.210000 339865 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0321977Z E1204 12:39:25.210000 339865 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0322542Z E1204 12:39:25.210000 339865 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0322895Z E1204 12:39:25.210000 339865 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0323292Z E1204 12:39:25.210000 339865 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:44:29.0323619Z E1204 12:39:25.231000 339864 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0323944Z E1204 12:39:25.231000 339864 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0324420Z E1204 12:39:25.231000 339864 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0324884Z E1204 12:39:25.231000 339864 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0325381Z E1204 12:39:25.231000 339864 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0325813Z E1204 12:39:25.231000 339864 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0326239Z E1204 12:39:25.231000 339864 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0326686Z E1204 12:39:25.231000 339864 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0327134Z E1204 12:39:25.231000 339864 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0327582Z E1204 12:39:25.231000 339864 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0328031Z E1204 12:39:25.231000 339864 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0328468Z E1204 12:39:25.231000 339864 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0328926Z E1204 12:39:25.231000 339864 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0329376Z E1204 12:39:25.231000 339864 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0330111Z E1204 12:39:25.231000 339864 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1268776960 and is now 3076521984. 
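The FutureWarning emitted by test_hsdp_dtensor_state_dict.py earlier in this log recommends the torch.distributed.checkpoint.state_dict APIs over FSDP.set_state_dict_type(). A minimal sketch of that suggested replacement, assuming an FSDP-wrapped model and its optimizer already exist (model and optimizer are placeholders; usage follows the API doc linked in the warning):

    from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict

    # Fetch matching model and optimizer state dicts in one call.
    model_sd, optim_sd = get_state_dict(model, optimizer)

    # ... persist or transform the state dicts as needed ...

    # Restore them later; per the warning this path covers FSDP1, FSDP2 and DDP.
    set_state_dict(
        model,
        optimizer,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
    )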
2025-12-04T12:44:29.0330769Z E1204 12:39:25.231000 339864 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0331104Z E1204 12:39:25.231000 339864 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0331752Z E1204 12:39:25.231000 339864 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0332320Z E1204 12:39:25.231000 339864 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0332668Z E1204 12:39:25.231000 339864 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0333067Z E1204 12:39:25.231000 339864 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:44:29.0333305Z FAILED [10.1200s] [ 12%] 2025-12-04T12:44:29.0333377Z 2025-12-04T12:44:29.0333436Z =================================== FAILURES =================================== 2025-12-04T12:44:29.0333684Z _ TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda _ 2025-12-04T12:44:29.0333917Z Traceback (most recent call last): 2025-12-04T12:44:29.0334165Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:44:29.0334448Z self._join_processes(fn) 2025-12-04T12:44:29.0334697Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:44:29.0334963Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:44:29.0335238Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:44:29.0335500Z raise RuntimeError(error) 2025-12-04T12:44:29.0335655Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:44:29.0335818Z Traceback (most recent call last): 2025-12-04T12:44:29.0336059Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0336306Z getattr(self, test_name)() 2025-12-04T12:44:29.0336544Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0336778Z fn() 2025-12-04T12:44:29.0336983Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0337216Z method(*args, **kwargs) 2025-12-04T12:44:29.0337441Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0337671Z method(*args, **kwargs) 2025-12-04T12:44:29.0337908Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0338138Z with policy(): 2025-12-04T12:44:29.0338360Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T12:44:29.0338595Z raise RuntimeError(msg) 2025-12-04T12:44:29.0339088Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3242196992. 2025-12-04T12:44:29.0339522Z 2025-12-04T12:44:29.0339625Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0340041Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0340385Z 2025-12-04T12:44:29.0340476Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0340604Z 2025-12-04T12:44:29.0340605Z 2025-12-04T12:44:29.0340688Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:44:29.0340894Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:44:29.0341297Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-cc2b69bb2c652278.xml - 2025-12-04T12:44:29.0341664Z =========================== short test summary info ============================ 2025-12-04T12:44:29.0342083Z FAILED [10.1200s] distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:44:29.0342479Z Traceback (most recent call last): 2025-12-04T12:44:29.0342728Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0342975Z getattr(self, test_name)() 2025-12-04T12:44:29.0343210Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0343484Z fn() 2025-12-04T12:44:29.0343686Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0343922Z method(*args, **kwargs) 2025-12-04T12:44:29.0344143Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0344376Z method(*args, **kwargs) 2025-12-04T12:44:29.0344598Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0344828Z with policy(): 2025-12-04T12:44:29.0345042Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0345275Z raise RuntimeError(msg) 2025-12-04T12:44:29.0345747Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3242196992. 
2025-12-04T12:44:29.0346185Z 2025-12-04T12:44:29.0346260Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0346679Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0347041Z 2025-12-04T12:44:29.0347129Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0347318Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:44:29.0347478Z ============================== 1 failed in 10.13s ============================== 2025-12-04T12:44:29.0347626Z Got exit code 1 2025-12-04T12:44:29.0347733Z Retrying single test... 2025-12-04T12:44:29.0348026Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-9152f00d6bb6fe13.xml 2025-12-04T12:44:29.0348347Z ============================= test session starts ============================== 2025-12-04T12:44:29.0348564Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:44:29.0348756Z cachedir: .pytest_cache 2025-12-04T12:44:29.0348983Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:44:29.0349225Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:44:29.0349345Z configfile: pytest.ini 2025-12-04T12:44:29.0349613Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:44:29.0349888Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T12:44:29.0350289Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0350661Z Running 1 items in this shard 2025-12-04T12:44:29.0350737Z 2025-12-04T12:44:29.0351111Z distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda I1204 12:39:29.322000 340264 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 340333 2025-12-04T12:44:29.0351678Z I1204 12:39:29.323000 340264 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 340334 2025-12-04T12:44:29.0352021Z I1204 12:39:29.323000 340264 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 340335 2025-12-04T12:44:29.0352403Z I1204 12:39:29.324000 340264 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 340336 2025-12-04T12:44:29.0353282Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:243: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:44:29.0354032Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0354772Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:243: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0355515Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0356250Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:243: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0357011Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0357748Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:243: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0358499Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0359869Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:44:29.0361299Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.0362757Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. 
This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:44:29.0364158Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.0365578Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:44:29.0366998Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.0368404Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:44:29.0369854Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.0370151Z E1204 12:39:37.959000 340334 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0370479Z E1204 12:39:37.959000 340334 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0370951Z E1204 12:39:37.959000 340334 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0371416Z E1204 12:39:37.959000 340334 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0371886Z E1204 12:39:37.959000 340334 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0372347Z E1204 12:39:37.959000 340334 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0372774Z E1204 12:39:37.959000 340334 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0373223Z E1204 12:39:37.959000 340334 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0373674Z E1204 12:39:37.959000 340334 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0374122Z E1204 12:39:37.959000 340334 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0374576Z E1204 12:39:37.959000 340334 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0375013Z E1204 12:39:37.959000 340334 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0375451Z E1204 12:39:37.959000 340334 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0375899Z E1204 12:39:37.959000 340334 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0376628Z E1204 12:39:37.959000 340334 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1268776960 and is now 3076521984. 
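[Editorial note] The repeated AccumulateGrad stream-mismatch UserWarning above is separate from the leak failure; per the warning's own text, it can be silenced when the mismatch is intentional. A one-line sketch (whether silencing is appropriate for this test is an assumption, not something the log establishes):

import torch

# The UserWarning above states this switch suppresses the AccumulateGrad
# stream-mismatch warning when the mismatch is intentional; call it before
# the backward pass that triggers the warning.
torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)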
2025-12-04T12:44:29.0377301Z E1204 12:39:37.959000 340334 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0377638Z E1204 12:39:37.959000 340334 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0378288Z E1204 12:39:37.959000 340334 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0378855Z E1204 12:39:37.959000 340334 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0379205Z E1204 12:39:37.959000 340334 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0379647Z E1204 12:39:37.959000 340334 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:44:29.0379976Z E1204 12:39:37.969000 340333 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0380299Z E1204 12:39:37.969000 340333 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0380769Z E1204 12:39:37.969000 340333 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0381233Z E1204 12:39:37.969000 340333 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0381737Z E1204 12:39:37.969000 340333 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0382168Z E1204 12:39:37.969000 340333 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0382589Z E1204 12:39:37.969000 340333 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0383038Z E1204 12:39:37.969000 340333 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0383487Z E1204 12:39:37.969000 340333 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0383938Z E1204 12:39:37.969000 340333 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0384395Z E1204 12:39:37.969000 340333 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0384829Z E1204 12:39:37.969000 340333 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0385267Z E1204 12:39:37.969000 340333 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0385731Z E1204 12:39:37.969000 340333 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0386433Z E1204 12:39:37.969000 340333 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3240099840. 2025-12-04T12:44:29.0387103Z E1204 12:39:37.969000 340333 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0387437Z E1204 12:39:37.969000 340333 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0388084Z E1204 12:39:37.969000 340333 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0388651Z E1204 12:39:37.969000 340333 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0389004Z E1204 12:39:37.969000 340333 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0402635Z E1204 12:39:37.969000 340333 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:44:29.0402997Z E1204 12:39:37.973000 340335 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0403333Z E1204 12:39:37.973000 340335 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0403840Z E1204 12:39:37.973000 340335 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0404316Z E1204 12:39:37.973000 340335 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0404865Z E1204 12:39:37.973000 340335 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0405308Z E1204 12:39:37.973000 340335 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0405740Z E1204 12:39:37.973000 340335 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0406200Z E1204 12:39:37.973000 340335 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0406659Z E1204 12:39:37.973000 340335 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 
3329, in wrapper 2025-12-04T12:44:29.0407115Z E1204 12:39:37.973000 340335 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0407569Z E1204 12:39:37.973000 340335 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0408011Z E1204 12:39:37.973000 340335 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0408482Z E1204 12:39:37.973000 340335 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0408939Z E1204 12:39:37.973000 340335 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0409699Z E1204 12:39:37.973000 340335 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 2. CUDA driver allocated memory was 1268776960 and is now 3076521984. 2025-12-04T12:44:29.0410373Z E1204 12:39:37.973000 340335 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0410718Z E1204 12:39:37.973000 340335 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0411385Z E1204 12:39:37.973000 340335 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0411971Z E1204 12:39:37.973000 340335 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0412333Z E1204 12:39:37.973000 340335 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0412741Z E1204 12:39:37.973000 340335 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:44:29.0413077Z E1204 12:39:37.980000 340336 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0413410Z E1204 12:39:37.980000 340336 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0413895Z E1204 12:39:37.980000 340336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0414401Z E1204 12:39:37.980000 340336 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0414870Z E1204 12:39:37.980000 340336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0415310Z E1204 12:39:37.980000 340336 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0415740Z E1204 12:39:37.980000 340336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0416194Z E1204 12:39:37.980000 340336 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0416665Z E1204 12:39:37.980000 340336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0417122Z E1204 12:39:37.980000 340336 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0417576Z E1204 12:39:37.980000 340336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0418042Z E1204 12:39:37.980000 340336 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0418492Z E1204 12:39:37.980000 340336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0418969Z E1204 12:39:37.980000 340336 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0419725Z E1204 12:39:37.980000 340336 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 3. CUDA driver allocated memory was 1256194048 and is now 3076521984. 
2025-12-04T12:44:29.0420389Z E1204 12:39:37.980000 340336 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0420730Z E1204 12:39:37.980000 340336 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0421386Z E1204 12:39:37.980000 340336 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0421953Z E1204 12:39:37.980000 340336 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0422309Z E1204 12:39:37.980000 340336 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0422714Z E1204 12:39:37.980000 340336 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:44:29.0422955Z FAILED [9.9192s] [100%] 2025-12-04T12:44:29.0423029Z 2025-12-04T12:44:29.0423089Z =================================== FAILURES =================================== 2025-12-04T12:44:29.0423343Z _ TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda _ 2025-12-04T12:44:29.0423618Z Traceback (most recent call last): 2025-12-04T12:44:29.0423877Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:44:29.0424134Z self._join_processes(fn) 2025-12-04T12:44:29.0424389Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:44:29.0424661Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:44:29.0424937Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:44:29.0425204Z raise RuntimeError(error) 2025-12-04T12:44:29.0425361Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:44:29.0425531Z Traceback (most recent call last): 2025-12-04T12:44:29.0425778Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0426034Z getattr(self, test_name)() 2025-12-04T12:44:29.0426274Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0426512Z fn() 2025-12-04T12:44:29.0426722Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0426960Z method(*args, **kwargs) 2025-12-04T12:44:29.0427190Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0427448Z method(*args, **kwargs) 2025-12-04T12:44:29.0427675Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0427911Z with policy(): 2025-12-04T12:44:29.0428148Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T12:44:29.0428386Z raise RuntimeError(msg) 2025-12-04T12:44:29.0428860Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3240099840. 2025-12-04T12:44:29.0429298Z 2025-12-04T12:44:29.0429375Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0429838Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0430186Z 2025-12-04T12:44:29.0430282Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0430411Z 2025-12-04T12:44:29.0430479Z Process 1 exited with error code 10 and exception: 2025-12-04T12:44:29.0430629Z Traceback (most recent call last): 2025-12-04T12:44:29.0430881Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0431131Z getattr(self, test_name)() 2025-12-04T12:44:29.0431373Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0431612Z fn() 2025-12-04T12:44:29.0431825Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0432064Z method(*args, **kwargs) 2025-12-04T12:44:29.0432291Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0432530Z method(*args, **kwargs) 2025-12-04T12:44:29.0432792Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0433028Z with policy(): 2025-12-04T12:44:29.0433253Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0433494Z raise RuntimeError(msg) 2025-12-04T12:44:29.0433970Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1268776960 and is now 3076521984. 2025-12-04T12:44:29.0434400Z 2025-12-04T12:44:29.0434481Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0434906Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0435243Z 2025-12-04T12:44:29.0435339Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0435464Z 2025-12-04T12:44:29.0435466Z 2025-12-04T12:44:29.0435553Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:44:29.0435762Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:44:29.0436167Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-9152f00d6bb6fe13.xml - 2025-12-04T12:44:29.0436571Z =========================== short test summary info ============================ 2025-12-04T12:44:29.0436990Z FAILED [9.9192s] distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:44:29.0437408Z Traceback (most recent call last): 2025-12-04T12:44:29.0437662Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0437914Z getattr(self, test_name)() 2025-12-04T12:44:29.0438154Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0438394Z fn() 2025-12-04T12:44:29.0438605Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0438846Z method(*args, **kwargs) 2025-12-04T12:44:29.0439076Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0439319Z method(*args, **kwargs) 2025-12-04T12:44:29.0439550Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0439825Z with policy(): 2025-12-04T12:44:29.0440049Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0440289Z raise RuntimeError(msg) 2025-12-04T12:44:29.0440768Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3240099840. 
2025-12-04T12:44:29.0441210Z 2025-12-04T12:44:29.0441288Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0441711Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0442087Z 2025-12-04T12:44:29.0442180Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0442304Z 2025-12-04T12:44:29.0442366Z Process 1 exited with error code 10 and exception: 2025-12-04T12:44:29.0442507Z Traceback (most recent call last): 2025-12-04T12:44:29.0442756Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0443003Z getattr(self, test_name)() 2025-12-04T12:44:29.0443238Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0443472Z fn() 2025-12-04T12:44:29.0443676Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0443907Z method(*args, **kwargs) 2025-12-04T12:44:29.0444131Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0444361Z method(*args, **kwargs) 2025-12-04T12:44:29.0444580Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0444806Z with policy(): 2025-12-04T12:44:29.0445019Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0445273Z raise RuntimeError(msg) 2025-12-04T12:44:29.0445739Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1268776960 and is now 3076521984. 2025-12-04T12:44:29.0446184Z 2025-12-04T12:44:29.0446260Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0446673Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0447012Z 2025-12-04T12:44:29.0447099Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0447289Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:44:29.0447458Z ======================= 1 failed, 7 deselected in 9.93s ======================== 2025-12-04T12:44:29.0447600Z Got exit code 1 2025-12-04T12:44:29.0447700Z Retrying single test... 
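[Editorial note] Each attempt in this log also emits the same FutureWarning about FSDP.state_dict_type()/FSDP.set_state_dict_type() being deprecated in favor of the torch.distributed.checkpoint.state_dict APIs. A minimal sketch of that migration, assuming an FSDP-wrapped model and an optimizer (the names and options below are illustrative assumptions, not taken from the failing test):

from torch.distributed.checkpoint.state_dict import (
    StateDictOptions,
    get_state_dict,
    set_state_dict,
)

def roundtrip_state_dict(model, optimizer):
    # Keep shards on GPU, mirroring the offload_to_cpu_False variant of the test.
    options = StateDictOptions(cpu_offload=False)
    model_sd, optim_sd = get_state_dict(model, optimizer, options=options)
    # ... persist or transform model_sd / optim_sd here ...
    set_state_dict(
        model,
        optimizer,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
        options=options,
    )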
2025-12-04T12:44:29.0447998Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-2d3dc27ffee4c9ac.xml 2025-12-04T12:44:29.0448324Z ============================= test session starts ============================== 2025-12-04T12:44:29.0448541Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:44:29.0448734Z cachedir: .pytest_cache 2025-12-04T12:44:29.0448960Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:44:29.0449202Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:44:29.0449326Z configfile: pytest.ini 2025-12-04T12:44:29.0449557Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:44:29.0449874Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T12:44:29.0450277Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0450649Z Running 1 items in this shard 2025-12-04T12:44:29.0450758Z 2025-12-04T12:44:29.0451139Z distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda I1204 12:39:41.740000 340734 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 340803 2025-12-04T12:44:29.0451712Z I1204 12:39:41.740000 340734 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 340804 2025-12-04T12:44:29.0452060Z I1204 12:39:41.741000 340734 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 340805 2025-12-04T12:44:29.0452405Z I1204 12:39:41.741000 340734 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 340806 2025-12-04T12:44:29.0453293Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:243: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0454046Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0454781Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:243: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0455560Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0456295Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:243: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. 
Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0457035Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0457765Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:243: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0458503Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0459908Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:44:29.0461329Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.0462745Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:44:29.0464152Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.0465584Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. 
This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:44:29.0467024Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.0468447Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:44:29.0469904Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.0470201Z E1204 12:39:50.479000 340806 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0470531Z E1204 12:39:50.479000 340806 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0471052Z E1204 12:39:50.479000 340806 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0471514Z E1204 12:39:50.479000 340806 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0471979Z E1204 12:39:50.479000 340806 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0472410Z E1204 12:39:50.479000 340806 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0472832Z E1204 12:39:50.479000 340806 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0473279Z E1204 12:39:50.479000 340806 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0473721Z E1204 12:39:50.479000 340806 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0474162Z E1204 12:39:50.479000 340806 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0474626Z E1204 12:39:50.479000 340806 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0475182Z E1204 12:39:50.479000 340806 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0475624Z E1204 12:39:50.479000 340806 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0476085Z E1204 12:39:50.479000 340806 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0476783Z E1204 12:39:50.479000 340806 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 3. CUDA driver allocated memory was 1268776960 and is now 3076521984. 
2025-12-04T12:44:29.0477489Z E1204 12:39:50.479000 340806 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0477822Z E1204 12:39:50.479000 340806 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0478470Z E1204 12:39:50.479000 340806 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0479030Z E1204 12:39:50.479000 340806 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0479375Z E1204 12:39:50.479000 340806 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0479812Z E1204 12:39:50.479000 340806 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:44:29.0480137Z E1204 12:39:50.486000 340803 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0480492Z E1204 12:39:50.486000 340803 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0480959Z E1204 12:39:50.486000 340803 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0481416Z E1204 12:39:50.486000 340803 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0481878Z E1204 12:39:50.486000 340803 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0482305Z E1204 12:39:50.486000 340803 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0482732Z E1204 12:39:50.486000 340803 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0483177Z E1204 12:39:50.486000 340803 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0483618Z E1204 12:39:50.486000 340803 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0484079Z E1204 12:39:50.486000 340803 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0484522Z E1204 12:39:50.486000 340803 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0484975Z E1204 12:39:50.486000 340803 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0485410Z E1204 12:39:50.486000 340803 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0485855Z E1204 12:39:50.486000 340803 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0486546Z E1204 12:39:50.486000 340803 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3240099840. 2025-12-04T12:44:29.0487198Z E1204 12:39:50.486000 340803 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0487532Z E1204 12:39:50.486000 340803 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0488176Z E1204 12:39:50.486000 340803 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0488733Z E1204 12:39:50.486000 340803 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0489076Z E1204 12:39:50.486000 340803 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0489469Z E1204 12:39:50.486000 340803 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:44:29.0489870Z E1204 12:39:50.521000 340804 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0490187Z E1204 12:39:50.521000 340804 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0490652Z E1204 12:39:50.521000 340804 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0491112Z E1204 12:39:50.521000 340804 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0491570Z E1204 12:39:50.521000 340804 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0492005Z E1204 12:39:50.521000 340804 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0492424Z E1204 12:39:50.521000 340804 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0492866Z E1204 12:39:50.521000 340804 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0493309Z E1204 12:39:50.521000 340804 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 
3329, in wrapper 2025-12-04T12:44:29.0493766Z E1204 12:39:50.521000 340804 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0494215Z E1204 12:39:50.521000 340804 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0494659Z E1204 12:39:50.521000 340804 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0495095Z E1204 12:39:50.521000 340804 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0495540Z E1204 12:39:50.521000 340804 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0496231Z E1204 12:39:50.521000 340804 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1268776960 and is now 3076521984. 2025-12-04T12:44:29.0496889Z E1204 12:39:50.521000 340804 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0497220Z E1204 12:39:50.521000 340804 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0497864Z E1204 12:39:50.521000 340804 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0498421Z E1204 12:39:50.521000 340804 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0498823Z E1204 12:39:50.521000 340804 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0499243Z E1204 12:39:50.521000 340804 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:44:29.0499567Z E1204 12:39:50.528000 340805 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0499926Z E1204 12:39:50.528000 340805 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0500394Z E1204 12:39:50.528000 340805 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0500855Z E1204 12:39:50.528000 340805 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0501362Z E1204 12:39:50.528000 340805 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0501793Z E1204 12:39:50.528000 340805 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0502253Z E1204 12:39:50.528000 340805 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0502700Z E1204 12:39:50.528000 340805 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0503161Z E1204 12:39:50.528000 340805 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0503636Z E1204 12:39:50.528000 340805 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0504096Z E1204 12:39:50.528000 340805 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0504562Z E1204 12:39:50.528000 340805 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0504998Z E1204 12:39:50.528000 340805 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0505447Z E1204 12:39:50.528000 340805 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0506144Z E1204 12:39:50.528000 340805 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 2. CUDA driver allocated memory was 1268776960 and is now 3076521984. 
2025-12-04T12:44:29.0506799Z E1204 12:39:50.528000 340805 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0507132Z E1204 12:39:50.528000 340805 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0507783Z E1204 12:39:50.528000 340805 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0508345Z E1204 12:39:50.528000 340805 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0508741Z E1204 12:39:50.528000 340805 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0509132Z E1204 12:39:50.528000 340805 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:44:29.0509364Z FAILED [10.1199s] [100%] 2025-12-04T12:44:29.0509432Z 2025-12-04T12:44:29.0509488Z =================================== FAILURES =================================== 2025-12-04T12:44:29.0509763Z _ TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda _ 2025-12-04T12:44:29.0509992Z Traceback (most recent call last): 2025-12-04T12:44:29.0510233Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:44:29.0510473Z self._join_processes(fn) 2025-12-04T12:44:29.0510721Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:44:29.0510984Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:44:29.0511251Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:44:29.0511507Z raise RuntimeError(error) 2025-12-04T12:44:29.0511655Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:44:29.0511813Z Traceback (most recent call last): 2025-12-04T12:44:29.0512072Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0512313Z getattr(self, test_name)() 2025-12-04T12:44:29.0512543Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0512786Z fn() 2025-12-04T12:44:29.0512993Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0513219Z method(*args, **kwargs) 2025-12-04T12:44:29.0513438Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0513668Z method(*args, **kwargs) 2025-12-04T12:44:29.0513884Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0514110Z with policy(): 2025-12-04T12:44:29.0514319Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T12:44:29.0514546Z raise RuntimeError(msg) 2025-12-04T12:44:29.0515016Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 3. CUDA driver allocated memory was 1268776960 and is now 3076521984. 2025-12-04T12:44:29.0515446Z 2025-12-04T12:44:29.0515519Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0515928Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0516267Z 2025-12-04T12:44:29.0516357Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0516483Z 2025-12-04T12:44:29.0516485Z 2025-12-04T12:44:29.0516563Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:44:29.0516762Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:44:29.0517191Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-2d3dc27ffee4c9ac.xml - 2025-12-04T12:44:29.0517554Z =========================== short test summary info ============================ 2025-12-04T12:44:29.0517962Z FAILED [10.1199s] distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:44:29.0518350Z Traceback (most recent call last): 2025-12-04T12:44:29.0518593Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0518834Z getattr(self, test_name)() 2025-12-04T12:44:29.0519068Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0519297Z fn() 2025-12-04T12:44:29.0519499Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0519767Z method(*args, **kwargs) 2025-12-04T12:44:29.0519985Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0520210Z method(*args, **kwargs) 2025-12-04T12:44:29.0520425Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0520668Z with policy(): 2025-12-04T12:44:29.0520877Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0521106Z raise RuntimeError(msg) 2025-12-04T12:44:29.0521572Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 3. CUDA driver allocated memory was 1268776960 and is now 3076521984. 
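Every rank fails the same way: the mem_leak_check wrapper snapshots caching-allocator and driver-level memory per device before the test body runs and compares the numbers again in the context manager's __exit__ (common_utils.py line 2705), and here both counters grew (13824 bytes of caching-allocator memory plus roughly 1.8 GB at the driver level), so each process exits with code 10. A minimal, hypothetical sketch of that style of check, not the actual common_utils.py implementation; check_for_leak and run_test are made-up names, and torch.cuda.mem_get_info stands in for the driver-API query:

    # Hypothetical sketch of a before/after memory comparison; not PyTorch's checker.
    import torch

    def check_for_leak(device, run_test):
        torch.cuda.synchronize(device)
        caching_before = torch.cuda.memory_allocated(device)   # caching-allocator bytes
        free, total = torch.cuda.mem_get_info(device)
        driver_before = total - free                            # driver-level usage

        run_test()

        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        caching_after = torch.cuda.memory_allocated(device)
        free, _ = torch.cuda.mem_get_info(device)
        driver_after = total - free
        if caching_after > caching_before and driver_after > driver_before:
            raise RuntimeError(
                f"possible leak on device {device}: caching allocator "
                f"{caching_before} -> {caching_after}, driver {driver_before} -> {driver_after}"
            )

Per the repro hint printed with each failure, the test can be rerun standalone with PYTORCH_TEST_WITH_ROCM=1 and PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 set; PYTORCH_PRINT_REPRO_ON_FAILURE=0 only hides the repro message, it does not disable the leak check.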
2025-12-04T12:44:29.0522018Z 2025-12-04T12:44:29.0522091Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0522508Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0522847Z 2025-12-04T12:44:29.0522935Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0523120Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:44:29.0523281Z ======================= 1 failed, 7 deselected in 10.13s ======================= 2025-12-04T12:44:29.0523416Z Got exit code 1 2025-12-04T12:44:29.0523724Z FAILED CONSISTENTLY: test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0524131Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:44:29.0524515Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-7802117e87e0d3bb.xml 2025-12-04T12:44:29.0524832Z ============================= test session starts ============================== 2025-12-04T12:44:29.0525047Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:44:29.0525233Z cachedir: .pytest_cache 2025-12-04T12:44:29.0525457Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:44:29.0525693Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:44:29.0525810Z configfile: pytest.ini 2025-12-04T12:44:29.0526071Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:44:29.0526339Z collecting ... collected 8 items / 1 deselected / 7 selected 2025-12-04T12:44:29.0526495Z stepcurrent: skipping 1 already run items. 2025-12-04T12:44:29.0526622Z Running 7 items in this shard 2025-12-04T12:44:29.0526692Z 2025-12-04T12:44:29.0527066Z distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda I1204 12:39:54.365000 341204 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 341273 2025-12-04T12:44:29.0527624Z I1204 12:39:54.366000 341204 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 341274 2025-12-04T12:44:29.0527960Z I1204 12:39:54.366000 341204 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 341275 2025-12-04T12:44:29.0528299Z I1204 12:39:54.367000 341204 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 341276 2025-12-04T12:44:29.0529168Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:243: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0529956Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0530696Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:243: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0531450Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0532184Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:243: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0532920Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0533658Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:243: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0534403Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0535789Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:44:29.0537198Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.0538611Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. 
This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:44:29.0540044Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.0541457Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:44:29.0542881Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.0544294Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
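The FutureWarning emitted from test_hsdp_dtensor_state_dict.py:243 recommends moving off FSDP.set_state_dict_type() to the checkpoint state-dict helpers it links. A hedged migration sketch against torch.distributed.checkpoint.state_dict, where model and optim are placeholders for the FSDP-wrapped module and its optimizer rather than names taken from this test:

    # Sketch of the migration the FutureWarning above suggests; `model` and `optim`
    # are placeholders, and the cpu_offload option mirrors the offload_to_cpu test parameter.
    from torch.distributed.checkpoint.state_dict import (
        StateDictOptions,
        get_state_dict,
        set_state_dict,
    )

    # Instead of FSDP.set_state_dict_type(...) followed by model.state_dict():
    model_sd, optim_sd = get_state_dict(
        model, optim, options=StateDictOptions(cpu_offload=True)
    )

    # Instead of FSDP.set_state_dict_type(...) followed by model.load_state_dict(...):
    set_state_dict(model, optim, model_state_dict=model_sd, optim_state_dict=optim_sd)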
2025-12-04T12:44:29.0545703Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.0546056Z E1204 12:40:02.980000 341273 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0546389Z E1204 12:40:02.980000 341273 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0546875Z E1204 12:40:02.980000 341273 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0547342Z E1204 12:40:02.980000 341273 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0547808Z E1204 12:40:02.980000 341273 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0548244Z E1204 12:40:02.980000 341273 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0548675Z E1204 12:40:02.980000 341273 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0549129Z E1204 12:40:02.980000 341273 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0549620Z E1204 12:40:02.980000 341273 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0550091Z E1204 12:40:02.980000 341273 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0550541Z E1204 12:40:02.980000 341273 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0550992Z E1204 12:40:02.980000 341273 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0551587Z E1204 12:40:02.980000 341273 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0552034Z E1204 12:40:02.980000 341273 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0552733Z E1204 12:40:02.980000 341273 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3219128320. 
2025-12-04T12:44:29.0553394Z E1204 12:40:02.980000 341273 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0553726Z E1204 12:40:02.980000 341273 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0554369Z E1204 12:40:02.980000 341273 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0554929Z E1204 12:40:02.980000 341273 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0555273Z E1204 12:40:02.980000 341273 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0555712Z E1204 12:40:02.980000 341273 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:44:29.0556036Z E1204 12:40:03.001000 341275 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0556357Z E1204 12:40:03.001000 341275 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0556827Z E1204 12:40:03.001000 341275 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0557286Z E1204 12:40:03.001000 341275 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0557744Z E1204 12:40:03.001000 341275 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0558172Z E1204 12:40:03.001000 341275 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0558595Z E1204 12:40:03.001000 341275 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0559041Z E1204 12:40:03.001000 341275 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0559503Z E1204 12:40:03.001000 341275 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0560013Z E1204 12:40:03.001000 341275 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0560476Z E1204 12:40:03.001000 341275 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0560906Z E1204 12:40:03.001000 341275 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0561340Z E1204 12:40:03.001000 341275 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0561792Z E1204 12:40:03.001000 341275 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0562482Z E1204 12:40:03.001000 341275 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 2. CUDA driver allocated memory was 1268776960 and is now 3053453312. 2025-12-04T12:44:29.0563139Z E1204 12:40:03.001000 341275 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0563467Z E1204 12:40:03.001000 341275 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0564111Z E1204 12:40:03.001000 341275 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0564675Z E1204 12:40:03.001000 341275 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0565053Z E1204 12:40:03.001000 341275 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0565457Z E1204 12:40:03.001000 341275 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:44:29.0565787Z E1204 12:40:03.029000 341276 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0566113Z E1204 12:40:03.029000 341276 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0566596Z E1204 12:40:03.029000 341276 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0567062Z E1204 12:40:03.029000 341276 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0567530Z E1204 12:40:03.029000 341276 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0567966Z E1204 12:40:03.029000 341276 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0568395Z E1204 12:40:03.029000 341276 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0568862Z E1204 12:40:03.029000 341276 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0569314Z E1204 12:40:03.029000 341276 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, 
in wrapper 2025-12-04T12:44:29.0569990Z E1204 12:40:03.029000 341276 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0570448Z E1204 12:40:03.029000 341276 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0570892Z E1204 12:40:03.029000 341276 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0571340Z E1204 12:40:03.029000 341276 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0571794Z E1204 12:40:03.029000 341276 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0572497Z E1204 12:40:03.029000 341276 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 3. CUDA driver allocated memory was 958398464 and is now 3053453312. 2025-12-04T12:44:29.0573155Z E1204 12:40:03.029000 341276 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0573493Z E1204 12:40:03.029000 341276 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0574231Z E1204 12:40:03.029000 341276 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0574834Z E1204 12:40:03.029000 341276 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0575180Z E1204 12:40:03.029000 341276 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0575577Z E1204 12:40:03.029000 341276 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:44:29.0575908Z E1204 12:40:03.097000 341274 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0576236Z E1204 12:40:03.097000 341274 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0576709Z E1204 12:40:03.097000 341274 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0577174Z E1204 12:40:03.097000 341274 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0577634Z E1204 12:40:03.097000 341274 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0578069Z E1204 12:40:03.097000 341274 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0578519Z E1204 12:40:03.097000 341274 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0578971Z E1204 12:40:03.097000 341274 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0579446Z E1204 12:40:03.097000 341274 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0579938Z E1204 12:40:03.097000 341274 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0580396Z E1204 12:40:03.097000 341274 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0580840Z E1204 12:40:03.097000 341274 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0581280Z E1204 12:40:03.097000 341274 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0581733Z E1204 12:40:03.097000 341274 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0582426Z E1204 12:40:03.097000 341274 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1268776960 and is now 3053453312. 
2025-12-04T12:44:29.0583086Z E1204 12:40:03.097000 341274 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0583424Z E1204 12:40:03.097000 341274 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0584102Z E1204 12:40:03.097000 341274 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0584667Z E1204 12:40:03.097000 341274 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0585015Z E1204 12:40:03.097000 341274 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0585410Z E1204 12:40:03.097000 341274 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:44:29.0585652Z FAILED [10.0206s] [ 14%] 2025-12-04T12:44:29.0585725Z 2025-12-04T12:44:29.0585784Z =================================== FAILURES =================================== 2025-12-04T12:44:29.0586031Z _ TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda _ 2025-12-04T12:44:29.0586271Z Traceback (most recent call last): 2025-12-04T12:44:29.0586523Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:44:29.0586771Z self._join_processes(fn) 2025-12-04T12:44:29.0587025Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:44:29.0587295Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:44:29.0587566Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:44:29.0587848Z raise RuntimeError(error) 2025-12-04T12:44:29.0588000Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:44:29.0588164Z Traceback (most recent call last): 2025-12-04T12:44:29.0588412Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0588678Z getattr(self, test_name)() 2025-12-04T12:44:29.0588918Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0589155Z fn() 2025-12-04T12:44:29.0589367Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0589636Z method(*args, **kwargs) 2025-12-04T12:44:29.0589861Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0590100Z method(*args, **kwargs) 2025-12-04T12:44:29.0590325Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0590557Z with policy(): 2025-12-04T12:44:29.0590776Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T12:44:29.0591020Z raise RuntimeError(msg) 2025-12-04T12:44:29.0591491Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3219128320. 2025-12-04T12:44:29.0591923Z 2025-12-04T12:44:29.0591999Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0592409Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0592748Z 2025-12-04T12:44:29.0592835Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0592958Z 2025-12-04T12:44:29.0593024Z Process 2 exited with error code 10 and exception: 2025-12-04T12:44:29.0593204Z Traceback (most recent call last): 2025-12-04T12:44:29.0593454Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0593697Z getattr(self, test_name)() 2025-12-04T12:44:29.0593935Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0594167Z fn() 2025-12-04T12:44:29.0594370Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0594598Z method(*args, **kwargs) 2025-12-04T12:44:29.0594815Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0595042Z method(*args, **kwargs) 2025-12-04T12:44:29.0595266Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0595491Z with policy(): 2025-12-04T12:44:29.0595702Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0595933Z raise RuntimeError(msg) 2025-12-04T12:44:29.0596396Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 2. CUDA driver allocated memory was 1268776960 and is now 3053453312. 2025-12-04T12:44:29.0596841Z 2025-12-04T12:44:29.0596918Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0597329Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0597685Z 2025-12-04T12:44:29.0597772Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0597898Z 2025-12-04T12:44:29.0597900Z 2025-12-04T12:44:29.0597976Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:44:29.0598174Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:44:29.0598565Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-7802117e87e0d3bb.xml - 2025-12-04T12:44:29.0598930Z =========================== short test summary info ============================ 2025-12-04T12:44:29.0599339Z FAILED [10.0206s] distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:44:29.0599770Z Traceback (most recent call last): 2025-12-04T12:44:29.0600015Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0600259Z getattr(self, test_name)() 2025-12-04T12:44:29.0600492Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0600747Z fn() 2025-12-04T12:44:29.0600948Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0601176Z method(*args, **kwargs) 2025-12-04T12:44:29.0601395Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0601620Z method(*args, **kwargs) 2025-12-04T12:44:29.0601842Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0602101Z with policy(): 2025-12-04T12:44:29.0602310Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0602540Z raise RuntimeError(msg) 2025-12-04T12:44:29.0603004Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3219128320. 
2025-12-04T12:44:29.0603440Z 2025-12-04T12:44:29.0603515Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0603928Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0604266Z 2025-12-04T12:44:29.0604359Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0604482Z 2025-12-04T12:44:29.0604541Z Process 2 exited with error code 10 and exception: 2025-12-04T12:44:29.0604681Z Traceback (most recent call last): 2025-12-04T12:44:29.0604925Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0605168Z getattr(self, test_name)() 2025-12-04T12:44:29.0605416Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0605647Z fn() 2025-12-04T12:44:29.0605846Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0606075Z method(*args, **kwargs) 2025-12-04T12:44:29.0606313Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0606542Z method(*args, **kwargs) 2025-12-04T12:44:29.0606758Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0606982Z with policy(): 2025-12-04T12:44:29.0607195Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0607424Z raise RuntimeError(msg) 2025-12-04T12:44:29.0607892Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 2. CUDA driver allocated memory was 1268776960 and is now 3053453312. 2025-12-04T12:44:29.0608318Z 2025-12-04T12:44:29.0608399Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0608814Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0609153Z 2025-12-04T12:44:29.0609240Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0609431Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:44:29.0609637Z ======================= 1 failed, 1 deselected in 10.03s ======================= 2025-12-04T12:44:29.0609779Z Got exit code 1 2025-12-04T12:44:29.0609877Z Retrying single test... 
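The retried session header below repeats the suite's Hypothesis configuration: profile 'pytorch_ci' with database=None, max_examples=50, derandomize=True, and HealthCheck.too_slow suppressed. For reference, a profile like that is normally registered and loaded from a conftest.py; this is an illustrative sketch with the values copied from the log, not PyTorch's actual conftest:

    # Illustrative registration of a Hypothesis profile matching the session header.
    from hypothesis import HealthCheck, settings

    settings.register_profile(
        "pytorch_ci",
        database=None,
        max_examples=50,
        derandomize=True,
        suppress_health_check=[HealthCheck.too_slow],
    )
    settings.load_profile("pytorch_ci")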
2025-12-04T12:44:29.0610168Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-483b30c5bd7b37b3.xml 2025-12-04T12:44:29.0610487Z ============================= test session starts ============================== 2025-12-04T12:44:29.0610744Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:44:29.0610937Z cachedir: .pytest_cache 2025-12-04T12:44:29.0611160Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:44:29.0611399Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:44:29.0611521Z configfile: pytest.ini 2025-12-04T12:44:29.0611751Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:44:29.0612026Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T12:44:29.0612429Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0612799Z Running 1 items in this shard 2025-12-04T12:44:29.0612874Z 2025-12-04T12:44:29.0613254Z distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda I1204 12:40:06.996000 341674 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 341743 2025-12-04T12:44:29.0613817Z I1204 12:40:06.997000 341674 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 341744 2025-12-04T12:44:29.0614159Z I1204 12:40:06.997000 341674 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 341745 2025-12-04T12:44:29.0614518Z I1204 12:40:06.998000 341674 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 341746 2025-12-04T12:44:29.0615390Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:243: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0616159Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0616899Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:243: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0617636Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0618369Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:243: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. 
Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0619106Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0619881Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:243: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0620657Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0621994Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:44:29.0623398Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.0624815Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:44:29.0626248Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.0627683Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. 
This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:44:29.0629107Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.0630620Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
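The UserWarning repeated above (autograd/graph.py:865) names its own opt-out: when the AccumulateGrad stream mismatch is intentional, the warning text says it can be silenced with torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False). A one-line sketch using exactly the call the warning mentions; whether suppressing it is appropriate here depends on the test's stream usage:

    # Silence the AccumulateGrad stream-mismatch warning, per the warning's own suggestion.
    # Only appropriate if the mismatch is known to be intentional.
    import torch

    torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)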
2025-12-04T12:44:29.0632052Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.0632346Z E1204 12:40:15.680000 341743 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0632673Z E1204 12:40:15.680000 341743 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0633151Z E1204 12:40:15.680000 341743 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0633623Z E1204 12:40:15.680000 341743 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0634094Z E1204 12:40:15.680000 341743 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0634545Z E1204 12:40:15.680000 341743 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0634971Z E1204 12:40:15.680000 341743 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0635435Z E1204 12:40:15.680000 341743 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0635884Z E1204 12:40:15.680000 341743 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0636334Z E1204 12:40:15.680000 341743 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0636785Z E1204 12:40:15.680000 341743 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0637224Z E1204 12:40:15.680000 341743 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0637666Z E1204 12:40:15.680000 341743 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0638116Z E1204 12:40:15.680000 341743 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0638811Z E1204 12:40:15.680000 341743 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3219128320. 
2025-12-04T12:44:29.0639471Z E1204 12:40:15.680000 341743 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0639854Z E1204 12:40:15.680000 341743 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0640542Z E1204 12:40:15.680000 341743 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0641113Z E1204 12:40:15.680000 341743 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0641467Z E1204 12:40:15.680000 341743 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0641865Z E1204 12:40:15.680000 341743 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:44:29.0642193Z E1204 12:40:15.702000 341745 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0642521Z E1204 12:40:15.702000 341745 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0642996Z E1204 12:40:15.702000 341745 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0643461Z E1204 12:40:15.702000 341745 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0643922Z E1204 12:40:15.702000 341745 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0644369Z E1204 12:40:15.702000 341745 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0644794Z E1204 12:40:15.702000 341745 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0645257Z E1204 12:40:15.702000 341745 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0645709Z E1204 12:40:15.702000 341745 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0646154Z E1204 12:40:15.702000 341745 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0646602Z E1204 12:40:15.702000 341745 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0647039Z E1204 12:40:15.702000 341745 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0647484Z E1204 12:40:15.702000 341745 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0647934Z E1204 12:40:15.702000 341745 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0648626Z E1204 12:40:15.702000 341745 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 2. CUDA driver allocated memory was 1268776960 and is now 3053453312. 2025-12-04T12:44:29.0649279Z E1204 12:40:15.702000 341745 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0649686Z E1204 12:40:15.702000 341745 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0650336Z E1204 12:40:15.702000 341745 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0650900Z E1204 12:40:15.702000 341745 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0651251Z E1204 12:40:15.702000 341745 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0651646Z E1204 12:40:15.702000 341745 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:44:29.0651978Z E1204 12:40:15.725000 341746 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0652301Z E1204 12:40:15.725000 341746 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0652775Z E1204 12:40:15.725000 341746 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0653238Z E1204 12:40:15.725000 341746 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0653717Z E1204 12:40:15.725000 341746 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0654166Z E1204 12:40:15.725000 341746 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0654593Z E1204 12:40:15.725000 341746 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0655040Z E1204 12:40:15.725000 341746 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0655491Z E1204 12:40:15.725000 341746 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, 
in wrapper 2025-12-04T12:44:29.0655938Z E1204 12:40:15.725000 341746 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0656385Z E1204 12:40:15.725000 341746 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0656826Z E1204 12:40:15.725000 341746 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0657262Z E1204 12:40:15.725000 341746 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0657715Z E1204 12:40:15.725000 341746 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0658410Z E1204 12:40:15.725000 341746 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 3. CUDA driver allocated memory was 958398464 and is now 3053453312. 2025-12-04T12:44:29.0659086Z E1204 12:40:15.725000 341746 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0659421Z E1204 12:40:15.725000 341746 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0660100Z E1204 12:40:15.725000 341746 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0660664Z E1204 12:40:15.725000 341746 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0661015Z E1204 12:40:15.725000 341746 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0661415Z E1204 12:40:15.725000 341746 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:44:29.0661742Z E1204 12:40:15.750000 341744 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0662063Z E1204 12:40:15.750000 341744 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0662533Z E1204 12:40:15.750000 341744 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0663012Z E1204 12:40:15.750000 341744 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0663474Z E1204 12:40:15.750000 341744 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0663919Z E1204 12:40:15.750000 341744 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0664342Z E1204 12:40:15.750000 341744 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0664790Z E1204 12:40:15.750000 341744 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0665239Z E1204 12:40:15.750000 341744 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0665683Z E1204 12:40:15.750000 341744 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0666135Z E1204 12:40:15.750000 341744 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0666571Z E1204 12:40:15.750000 341744 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0667007Z E1204 12:40:15.750000 341744 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0667460Z E1204 12:40:15.750000 341744 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0668180Z E1204 12:40:15.750000 341744 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1268776960 and is now 3053453312. 
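Each rank prints the same repro line: PYTORCH_TEST_WITH_ROCM=1 selects the ROCm test paths and PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 enables the allocator/driver comparison that failed here (PYTORCH_PRINT_REPRO_ON_FAILURE=0 only silences the repro message). A small wrapper that drives the same command from Python, with the command and variables copied from the log and the wrapper itself purely illustrative:

import os
import subprocess

env = dict(os.environ,
           PYTORCH_TEST_WITH_ROCM="1",
           PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1")

# Same invocation the log suggests; run from the base repo dir.
subprocess.run(
    ["python",
     "test/distributed/fsdp/test_hsdp_dtensor_state_dict.py",
     "TestHSDPWithDeviceMeshAndDTensorCUDA."
     "test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda"],
    env=env,
    check=True,
)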
2025-12-04T12:44:29.0668836Z E1204 12:40:15.750000 341744 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0669170Z E1204 12:40:15.750000 341744 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0669851Z E1204 12:40:15.750000 341744 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0670414Z E1204 12:40:15.750000 341744 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0670764Z E1204 12:40:15.750000 341744 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0671167Z E1204 12:40:15.750000 341744 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:44:29.0671404Z FAILED [10.0209s] [100%] 2025-12-04T12:44:29.0671475Z 2025-12-04T12:44:29.0671535Z =================================== FAILURES =================================== 2025-12-04T12:44:29.0671781Z _ TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda _ 2025-12-04T12:44:29.0672032Z Traceback (most recent call last): 2025-12-04T12:44:29.0672281Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:44:29.0672529Z self._join_processes(fn) 2025-12-04T12:44:29.0672796Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:44:29.0673063Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:44:29.0673334Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:44:29.0673595Z raise RuntimeError(error) 2025-12-04T12:44:29.0673748Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:44:29.0673911Z Traceback (most recent call last): 2025-12-04T12:44:29.0674157Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0674401Z getattr(self, test_name)() 2025-12-04T12:44:29.0674636Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0674871Z fn() 2025-12-04T12:44:29.0675079Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0675311Z method(*args, **kwargs) 2025-12-04T12:44:29.0675532Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0675763Z method(*args, **kwargs) 2025-12-04T12:44:29.0675984Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0676213Z with policy(): 2025-12-04T12:44:29.0676433Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T12:44:29.0676672Z raise RuntimeError(msg) 2025-12-04T12:44:29.0677172Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3219128320. 2025-12-04T12:44:29.0677604Z 2025-12-04T12:44:29.0677680Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0678091Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0678433Z 2025-12-04T12:44:29.0678524Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0678650Z 2025-12-04T12:44:29.0678651Z 2025-12-04T12:44:29.0678733Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:44:29.0678935Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:44:29.0679336Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-483b30c5bd7b37b3.xml - 2025-12-04T12:44:29.0679746Z =========================== short test summary info ============================ 2025-12-04T12:44:29.0680157Z FAILED [10.0209s] distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:44:29.0680547Z Traceback (most recent call last): 2025-12-04T12:44:29.0680794Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0681057Z getattr(self, test_name)() 2025-12-04T12:44:29.0681293Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0681529Z fn() 2025-12-04T12:44:29.0681758Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0681990Z method(*args, **kwargs) 2025-12-04T12:44:29.0682213Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0682443Z method(*args, **kwargs) 2025-12-04T12:44:29.0682662Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0687545Z with policy(): 2025-12-04T12:44:29.0687782Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0688017Z raise RuntimeError(msg) 2025-12-04T12:44:29.0688488Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3219128320. 
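The traceback shows the shape of the multiprocess harness: the parent spawns one process per rank, joins them in _join_processes, and _check_return_codes turns a non-zero exit code (10 here) into the RuntimeError above. A much-reduced sketch of that pattern with torch.multiprocessing; the real harness lives in common_distributed.py and the names below are illustrative only:

import torch.multiprocessing as mp

def _rank_main(rank: int, world_size: int):
    # Per-rank test body would go here; exit code 10 is what the log
    # uses to signal a test error inside a rank.
    raise SystemExit(10)

def run_multiprocess(world_size: int = 4):
    ctx = mp.get_context("spawn")
    procs = [ctx.Process(target=_rank_main, args=(r, world_size))
             for r in range(world_size)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    for r, p in enumerate(procs):
        if p.exitcode != 0:
            raise RuntimeError(f"Process {r} exited with error code {p.exitcode}")

if __name__ == "__main__":
    run_multiprocess()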
2025-12-04T12:44:29.0688919Z 2025-12-04T12:44:29.0688997Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0689411Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0689793Z 2025-12-04T12:44:29.0689882Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0690073Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:44:29.0690238Z ======================= 1 failed, 7 deselected in 10.03s ======================= 2025-12-04T12:44:29.0690375Z Got exit code 1 2025-12-04T12:44:29.0690474Z Retrying single test... 2025-12-04T12:44:29.0690816Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-ae4bcd06d210b4c8.xml 2025-12-04T12:44:29.0691140Z ============================= test session starts ============================== 2025-12-04T12:44:29.0691355Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:44:29.0691548Z cachedir: .pytest_cache 2025-12-04T12:44:29.0691770Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:44:29.0692013Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:44:29.0692130Z configfile: pytest.ini 2025-12-04T12:44:29.0692358Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:44:29.0692627Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T12:44:29.0693029Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0693398Z Running 1 items in this shard 2025-12-04T12:44:29.0693471Z 2025-12-04T12:44:29.0693849Z distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda I1204 12:40:19.921000 342144 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 342213 2025-12-04T12:44:29.0694429Z I1204 12:40:19.922000 342144 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 342214 2025-12-04T12:44:29.0694766Z I1204 12:40:19.922000 342144 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 342215 2025-12-04T12:44:29.0695105Z I1204 12:40:19.923000 342144 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 342216 2025-12-04T12:44:29.0695999Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:243: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:44:29.0696745Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0697485Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:243: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0698228Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0698963Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:243: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0699735Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0700497Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:243: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0701233Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0702589Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:44:29.0704006Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.0705425Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. 
This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:44:29.0706861Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.0708270Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:44:29.0709721Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.0711162Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
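Besides dropping graph references, the warning offers a second fix: perform DDP initialization on the same CUDA stream that later forwards run on, so the AccumulateGrad nodes it stashes already belong to that stream. A minimal single-process sketch of that arrangement (the rank-0/world-size-1 process-group setup and the tiny model are placeholders; a real job would be launched with torchrun across ranks):

import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group("nccl", rank=0, world_size=1)
torch.cuda.set_device(0)

fwd_stream = torch.cuda.Stream()

# Build DDP under the same stream the forwards will use, as the warning asks.
with torch.cuda.stream(fwd_stream):
    model = DDP(torch.nn.Linear(8, 8).cuda(), device_ids=[0])

for _ in range(2):
    with torch.cuda.stream(fwd_stream):
        loss = model(torch.randn(4, 8, device="cuda")).sum()
    loss.backward()
    del loss

dist.destroy_process_group()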
2025-12-04T12:44:29.0712577Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.0712866Z E1204 12:40:28.497000 342213 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0713196Z E1204 12:40:28.497000 342213 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0713666Z E1204 12:40:28.497000 342213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0714128Z E1204 12:40:28.497000 342213 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0714593Z E1204 12:40:28.497000 342213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0715038Z E1204 12:40:28.497000 342213 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0715463Z E1204 12:40:28.497000 342213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0715920Z E1204 12:40:28.497000 342213 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0716370Z E1204 12:40:28.497000 342213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0716817Z E1204 12:40:28.497000 342213 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0717265Z E1204 12:40:28.497000 342213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0717708Z E1204 12:40:28.497000 342213 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0718151Z E1204 12:40:28.497000 342213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0718599Z E1204 12:40:28.497000 342213 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0719300Z E1204 12:40:28.497000 342213 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3376414720. 
2025-12-04T12:44:29.0720002Z E1204 12:40:28.497000 342213 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0720366Z E1204 12:40:28.497000 342213 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0721017Z E1204 12:40:28.497000 342213 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0721583Z E1204 12:40:28.497000 342213 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0721937Z E1204 12:40:28.497000 342213 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0722336Z E1204 12:40:28.497000 342213 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:44:29.0722666Z E1204 12:40:28.555000 342216 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0722984Z E1204 12:40:28.555000 342216 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0723454Z E1204 12:40:28.555000 342216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0724216Z E1204 12:40:28.555000 342216 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0724678Z E1204 12:40:28.555000 342216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0725121Z E1204 12:40:28.555000 342216 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0725543Z E1204 12:40:28.555000 342216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0725989Z E1204 12:40:28.555000 342216 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0726436Z E1204 12:40:28.555000 342216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0726887Z E1204 12:40:28.555000 342216 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0727335Z E1204 12:40:28.555000 342216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0727769Z E1204 12:40:28.555000 342216 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0728210Z E1204 12:40:28.555000 342216 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0728660Z E1204 12:40:28.555000 342216 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0729359Z E1204 12:40:28.555000 342216 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 3. CUDA driver allocated memory was 1107296256 and is now 3422552064. 2025-12-04T12:44:29.0730077Z E1204 12:40:28.555000 342216 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0730408Z E1204 12:40:28.555000 342216 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0731054Z E1204 12:40:28.555000 342216 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0731615Z E1204 12:40:28.555000 342216 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0731963Z E1204 12:40:28.555000 342216 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0732366Z E1204 12:40:28.555000 342216 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:44:29.0732688Z E1204 12:40:28.585000 342215 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0733010Z E1204 12:40:28.585000 342215 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0733481Z E1204 12:40:28.585000 342215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0733959Z E1204 12:40:28.585000 342215 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0734422Z E1204 12:40:28.585000 342215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0734866Z E1204 12:40:28.585000 342215 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0735289Z E1204 12:40:28.585000 342215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0735743Z E1204 12:40:28.585000 342215 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0736196Z E1204 12:40:28.585000 342215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, 
in wrapper 2025-12-04T12:44:29.0736644Z E1204 12:40:28.585000 342215 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0737098Z E1204 12:40:28.585000 342215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0737539Z E1204 12:40:28.585000 342215 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0737978Z E1204 12:40:28.585000 342215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0738427Z E1204 12:40:28.585000 342215 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0739139Z E1204 12:40:28.585000 342215 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 2. CUDA driver allocated memory was 1262485504 and is now 3219128320. 2025-12-04T12:44:29.0739835Z E1204 12:40:28.585000 342215 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0740170Z E1204 12:40:28.585000 342215 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0740813Z E1204 12:40:28.585000 342215 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0741375Z E1204 12:40:28.585000 342215 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0741721Z E1204 12:40:28.585000 342215 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0742117Z E1204 12:40:28.585000 342215 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:44:29.0742441Z E1204 12:40:29.018000 342214 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0742763Z E1204 12:40:29.018000 342214 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0743246Z E1204 12:40:29.018000 342214 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0743712Z E1204 12:40:29.018000 342214 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0744188Z E1204 12:40:29.018000 342214 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0744617Z E1204 12:40:29.018000 342214 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0745034Z E1204 12:40:29.018000 342214 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0745483Z E1204 12:40:29.018000 342214 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0745930Z E1204 12:40:29.018000 342214 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0746378Z E1204 12:40:29.018000 342214 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0746824Z E1204 12:40:29.018000 342214 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0747253Z E1204 12:40:29.018000 342214 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0747529Z E1204 12:40:29.018000 342214 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0747669Z E1204 12:40:29.018000 342214 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0748321Z E1204 12:40:29.018000 342214 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1268776960 and is now 3053453312. 
2025-12-04T12:44:29.0748431Z E1204 12:40:29.018000 342214 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0748619Z E1204 12:40:29.018000 342214 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0749045Z E1204 12:40:29.018000 342214 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0749157Z E1204 12:40:29.018000 342214 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0749366Z E1204 12:40:29.018000 342214 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0749528Z E1204 12:40:29.018000 342214 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:44:29.0749611Z FAILED [9.9199s] [100%] 2025-12-04T12:44:29.0749630Z 2025-12-04T12:44:29.0749691Z =================================== FAILURES =================================== 2025-12-04T12:44:29.0749838Z _ TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda _ 2025-12-04T12:44:29.0749889Z Traceback (most recent call last): 2025-12-04T12:44:29.0750052Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:44:29.0750123Z self._join_processes(fn) 2025-12-04T12:44:29.0750298Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:44:29.0750357Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:44:29.0750537Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:44:29.0750583Z raise RuntimeError(error) 2025-12-04T12:44:29.0750667Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:44:29.0750715Z Traceback (most recent call last): 2025-12-04T12:44:29.0750876Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0750923Z getattr(self, test_name)() 2025-12-04T12:44:29.0751089Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0751126Z fn() 2025-12-04T12:44:29.0751280Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0751324Z method(*args, **kwargs) 2025-12-04T12:44:29.0751476Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0751520Z method(*args, **kwargs) 2025-12-04T12:44:29.0751674Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0751715Z with policy(): 2025-12-04T12:44:29.0751869Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T12:44:29.0751912Z raise RuntimeError(msg) 2025-12-04T12:44:29.0752354Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3376414720. 2025-12-04T12:44:29.0752357Z 2025-12-04T12:44:29.0752435Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0752740Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0752743Z 2025-12-04T12:44:29.0752830Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0752833Z 2025-12-04T12:44:29.0752834Z 2025-12-04T12:44:29.0752914Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:44:29.0753003Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:44:29.0753281Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-ae4bcd06d210b4c8.xml - 2025-12-04T12:44:29.0753345Z =========================== short test summary info ============================ 2025-12-04T12:44:29.0753660Z FAILED [9.9199s] distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:44:29.0753722Z Traceback (most recent call last): 2025-12-04T12:44:29.0753960Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0754005Z getattr(self, test_name)() 2025-12-04T12:44:29.0754180Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0754224Z fn() 2025-12-04T12:44:29.0754377Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0754419Z method(*args, **kwargs) 2025-12-04T12:44:29.0754571Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0754613Z method(*args, **kwargs) 2025-12-04T12:44:29.0754762Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0754802Z with policy(): 2025-12-04T12:44:29.0754956Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0754999Z raise RuntimeError(msg) 2025-12-04T12:44:29.0755398Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3376414720. 
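The messages around this second failure ("Got exit code 1", "Retrying single test...", then "FAILED CONSISTENTLY ... continuing ... due to continue-through-error being set") describe the runner's strategy: rerun only the failed test once, and mark it as consistently failing and move on with the shard only if the retry also fails. A simplified sketch of that control flow; run_single stands in for the pytest subprocess and none of this is the actual run_test.py code:

import subprocess

def run_single(test_id: str) -> int:
    # Stand-in for the pytest invocation shown in the log; returns the exit code.
    return subprocess.run(["python", "-m", "pytest", test_id, "-x"]).returncode

def run_with_retry(test_id: str, continue_through_error: bool = True) -> bool:
    if run_single(test_id) == 0:
        return True
    print("Got exit code 1")
    print("Retrying single test...")
    if run_single(test_id) == 0:
        return True  # flaky: the retry passed
    print(f"FAILED CONSISTENTLY: {test_id}")
    if not continue_through_error:
        raise RuntimeError(f"{test_id} failed consistently")
    return False  # keep going with the rest of the shard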
2025-12-04T12:44:29.0755403Z 2025-12-04T12:44:29.0755479Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0755782Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0755785Z 2025-12-04T12:44:29.0755871Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0755937Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:44:29.0755998Z ======================= 1 failed, 7 deselected in 9.93s ======================== 2025-12-04T12:44:29.0756039Z Got exit code 1 2025-12-04T12:44:29.0756314Z FAILED CONSISTENTLY: test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0756448Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:44:29.0756677Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-b149cca1e6d1f159.xml 2025-12-04T12:44:29.0756740Z ============================= test session starts ============================== 2025-12-04T12:44:29.0756855Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:44:29.0756899Z cachedir: .pytest_cache 2025-12-04T12:44:29.0757061Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:44:29.0757108Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:44:29.0757156Z configfile: pytest.ini 2025-12-04T12:44:29.0757318Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:44:29.0757391Z collecting ... collected 8 items / 2 deselected / 6 selected 2025-12-04T12:44:29.0757445Z stepcurrent: skipping 2 already run items. 2025-12-04T12:44:29.0757491Z Running 6 items in this shard 2025-12-04T12:44:29.0757493Z 2025-12-04T12:44:29.0757868Z distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda I1204 12:40:32.604000 342614 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 342683 2025-12-04T12:44:29.0758038Z I1204 12:40:32.605000 342614 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 342684 2025-12-04T12:44:29.0758201Z I1204 12:40:32.605000 342614 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 342685 2025-12-04T12:44:29.0758354Z I1204 12:40:32.606000 342614 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 342686 2025-12-04T12:44:29.0759042Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:188: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0759086Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0759789Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:188: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0759833Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0760497Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:188: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0760545Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0761234Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:188: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0761278Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0761776Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.0761829Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.0762319Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.0762367Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.0762871Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
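The FutureWarning repeated above points from FSDP.set_state_dict_type to the torch.distributed.checkpoint.state_dict helpers. A minimal sketch of the replacement call pattern the warning recommends; model and optimizer construction and the process-group setup are placeholders, and cpu_offload merely mirrors the offload_to_cpu option exercised by the failing test:

import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed.checkpoint.state_dict import (
    StateDictOptions,
    get_state_dict,
    set_state_dict,
)

def save_and_restore(model: FSDP, optim: torch.optim.Optimizer):
    # Gather sharded model + optimizer state via the recommended API.
    options = StateDictOptions(cpu_offload=True)
    model_sd, optim_sd = get_state_dict(model, optim, options=options)

    # ... persist model_sd / optim_sd, e.g. with torch.distributed.checkpoint ...

    # Load both back through the matching setter.
    set_state_dict(
        model,
        optim,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
        options=options,
    )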
2025-12-04T12:44:29.0762930Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.0763416Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.0763465Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.0763601Z E1204 12:40:41.384000 342683 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0763759Z E1204 12:40:41.384000 342683 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0764046Z E1204 12:40:41.384000 342683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0764198Z E1204 12:40:41.384000 342683 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0764477Z E1204 12:40:41.384000 342683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0764595Z E1204 12:40:41.384000 342683 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0764867Z E1204 12:40:41.384000 342683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0765009Z E1204 12:40:41.384000 342683 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0765299Z E1204 12:40:41.384000 342683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0765442Z E1204 12:40:41.384000 342683 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0765714Z E1204 12:40:41.384000 342683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0765844Z E1204 12:40:41.384000 342683 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0766118Z E1204 12:40:41.384000 342683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0766261Z E1204 12:40:41.384000 342683 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0766782Z E1204 12:40:41.384000 342683 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3472883712. 2025-12-04T12:44:29.0766901Z E1204 12:40:41.384000 342683 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0767092Z E1204 12:40:41.384000 342683 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0767515Z E1204 12:40:41.384000 342683 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0767633Z E1204 12:40:41.384000 342683 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0767836Z E1204 12:40:41.384000 342683 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0767995Z E1204 12:40:41.384000 342683 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:44:29.0768129Z E1204 12:40:41.395000 342686 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0768280Z E1204 12:40:41.395000 342686 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0768562Z E1204 12:40:41.395000 342686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0768711Z E1204 12:40:41.395000 342686 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0768988Z E1204 12:40:41.395000 342686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0769107Z E1204 12:40:41.395000 342686 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0769376Z E1204 12:40:41.395000 342686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0769542Z E1204 12:40:41.395000 342686 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0769846Z E1204 12:40:41.395000 342686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0769986Z E1204 12:40:41.395000 342686 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0770258Z E1204 12:40:41.395000 342686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0770386Z E1204 12:40:41.395000 342686 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T12:44:29.0770659Z E1204 12:40:41.395000 342686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0770801Z E1204 12:40:41.395000 342686 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0771330Z E1204 12:40:41.395000 342686 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 3. CUDA driver allocated memory was 1262485504 and is now 3307208704. 2025-12-04T12:44:29.0771450Z E1204 12:40:41.395000 342686 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0771638Z E1204 12:40:41.395000 342686 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0772070Z E1204 12:40:41.395000 342686 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0772178Z E1204 12:40:41.395000 342686 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0772380Z E1204 12:40:41.395000 342686 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0772538Z E1204 12:40:41.395000 342686 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:44:29.0772667Z E1204 12:40:41.430000 342685 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0772820Z E1204 12:40:41.430000 342685 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0773099Z E1204 12:40:41.430000 342685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0773244Z E1204 12:40:41.430000 342685 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0773522Z E1204 12:40:41.430000 342685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0773636Z E1204 12:40:41.430000 342685 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0773931Z E1204 12:40:41.430000 342685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0774073Z E1204 12:40:41.430000 342685 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0774341Z E1204 12:40:41.430000 342685 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0774480Z E1204 12:40:41.430000 342685 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0774748Z E1204 12:40:41.430000 342685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0774879Z E1204 12:40:41.430000 342685 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0775150Z E1204 12:40:41.430000 342685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0775290Z E1204 12:40:41.430000 342685 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0775809Z E1204 12:40:41.430000 342685 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 2. CUDA driver allocated memory was 1254096896 and is now 3307208704. 2025-12-04T12:44:29.0775926Z E1204 12:40:41.430000 342685 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0776126Z E1204 12:40:41.430000 342685 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0776544Z E1204 12:40:41.430000 342685 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0776653Z E1204 12:40:41.430000 342685 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0776854Z E1204 12:40:41.430000 342685 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0777009Z E1204 12:40:41.430000 342685 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:44:29.0777143Z E1204 12:40:41.432000 342684 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0777294Z E1204 12:40:41.432000 342684 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0777574Z E1204 12:40:41.432000 342684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0777722Z E1204 12:40:41.432000 342684 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0778001Z E1204 12:40:41.432000 342684 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0778138Z E1204 12:40:41.432000 342684 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0778406Z E1204 12:40:41.432000 342684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0778544Z E1204 12:40:41.432000 342684 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0778809Z E1204 12:40:41.432000 342684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0778949Z E1204 12:40:41.432000 342684 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0779219Z E1204 12:40:41.432000 342684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0779346Z E1204 12:40:41.432000 342684 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0779645Z E1204 12:40:41.432000 342684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0779788Z E1204 12:40:41.432000 342684 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0780322Z E1204 12:40:41.432000 342684 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1268776960 and is now 3307208704. 
2025-12-04T12:44:29.0780443Z E1204 12:40:41.432000 342684 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0780632Z E1204 12:40:41.432000 342684 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0781051Z E1204 12:40:41.432000 342684 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0781159Z E1204 12:40:41.432000 342684 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0781362Z E1204 12:40:41.432000 342684 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0781524Z E1204 12:40:41.432000 342684 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:44:29.0781570Z FAILED [10.1203s] [ 16%] 2025-12-04T12:44:29.0781572Z 2025-12-04T12:44:29.0781627Z =================================== FAILURES =================================== 2025-12-04T12:44:29.0781776Z _ TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda _ 2025-12-04T12:44:29.0781823Z Traceback (most recent call last): 2025-12-04T12:44:29.0781988Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:44:29.0782032Z self._join_processes(fn) 2025-12-04T12:44:29.0782209Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:44:29.0782266Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:44:29.0782471Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:44:29.0782516Z raise RuntimeError(error) 2025-12-04T12:44:29.0782597Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:44:29.0782643Z Traceback (most recent call last): 2025-12-04T12:44:29.0782805Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0782848Z getattr(self, test_name)() 2025-12-04T12:44:29.0783010Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0783044Z fn() 2025-12-04T12:44:29.0783199Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0783240Z method(*args, **kwargs) 2025-12-04T12:44:29.0783394Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0783434Z method(*args, **kwargs) 2025-12-04T12:44:29.0783587Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0783625Z with policy(): 2025-12-04T12:44:29.0783778Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T12:44:29.0783832Z raise RuntimeError(msg) 2025-12-04T12:44:29.0784232Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3472883712. 2025-12-04T12:44:29.0784251Z 2025-12-04T12:44:29.0784333Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0784636Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0784639Z 2025-12-04T12:44:29.0784728Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0784730Z 2025-12-04T12:44:29.0784790Z Process 3 exited with error code 10 and exception: 2025-12-04T12:44:29.0784837Z Traceback (most recent call last): 2025-12-04T12:44:29.0784999Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0785042Z getattr(self, test_name)() 2025-12-04T12:44:29.0785202Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0785239Z fn() 2025-12-04T12:44:29.0785393Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0785432Z method(*args, **kwargs) 2025-12-04T12:44:29.0785583Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0785622Z method(*args, **kwargs) 2025-12-04T12:44:29.0785775Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0785812Z with policy(): 2025-12-04T12:44:29.0785966Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0786006Z raise RuntimeError(msg) 2025-12-04T12:44:29.0786431Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 3. CUDA driver allocated memory was 1262485504 and is now 3307208704. 2025-12-04T12:44:29.0786435Z 2025-12-04T12:44:29.0786509Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0786811Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0786814Z 2025-12-04T12:44:29.0786903Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0786905Z 2025-12-04T12:44:29.0786907Z 2025-12-04T12:44:29.0786983Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:44:29.0787072Z Process 0 terminated with exit code 10, terminating remaining processes. 
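The RuntimeError above is raised by the memory-leak-check context manager in torch/testing/_internal/common_utils.py (its __exit__ appears at line 2705 in every traceback), which is active because this shard runs with PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1: it records per-device memory before the test and flags a leak when the caching-allocator figure has not returned to its starting value and driver-allocated memory has also grown. Below is a minimal sketch of that before/after comparison using public torch.cuda calls; the helper name check_leak is hypothetical and this is not the harness's actual implementation.

    import torch

    def check_leak(run_test, device=0):
        # Hypothetical sketch of the comparison described in the error message;
        # not the real common_utils.py leak checker.
        torch.cuda.synchronize(device)
        alloc_before = torch.cuda.memory_allocated(device)    # caching allocator, bytes
        free_before, total = torch.cuda.mem_get_info(device)  # driver-level view
        run_test()
        torch.cuda.synchronize(device)
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        if alloc_after > alloc_before and free_after < free_before:
            raise RuntimeError(
                f"possible leak on device {device}: caching allocator "
                f"{alloc_before} -> {alloc_after} bytes, driver allocated "
                f"{total - free_before} -> {total - free_after} bytes"
            )

In this log the caching-allocator figure goes from 0 to 13824 bytes on every rank while the driver-allocated figure grows by roughly 2 GB per device, which is the pattern the check reports as a confirmed leak.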
2025-12-04T12:44:29.0787351Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-b149cca1e6d1f159.xml - 2025-12-04T12:44:29.0787414Z =========================== short test summary info ============================ 2025-12-04T12:44:29.0787727Z FAILED [10.1203s] distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:44:29.0787787Z Traceback (most recent call last): 2025-12-04T12:44:29.0787952Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0787995Z getattr(self, test_name)() 2025-12-04T12:44:29.0788156Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0788202Z fn() 2025-12-04T12:44:29.0788355Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0788398Z method(*args, **kwargs) 2025-12-04T12:44:29.0788551Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0788590Z method(*args, **kwargs) 2025-12-04T12:44:29.0788742Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0788780Z with policy(): 2025-12-04T12:44:29.0788933Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0788974Z raise RuntimeError(msg) 2025-12-04T12:44:29.0789375Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3472883712. 
2025-12-04T12:44:29.0789379Z 2025-12-04T12:44:29.0789451Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0789800Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0789803Z 2025-12-04T12:44:29.0789888Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0789890Z 2025-12-04T12:44:29.0789950Z Process 3 exited with error code 10 and exception: 2025-12-04T12:44:29.0789994Z Traceback (most recent call last): 2025-12-04T12:44:29.0790157Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0790227Z getattr(self, test_name)() 2025-12-04T12:44:29.0790386Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0790422Z fn() 2025-12-04T12:44:29.0790573Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0790615Z method(*args, **kwargs) 2025-12-04T12:44:29.0790766Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0790806Z method(*args, **kwargs) 2025-12-04T12:44:29.0790956Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0790995Z with policy(): 2025-12-04T12:44:29.0791147Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0791193Z raise RuntimeError(msg) 2025-12-04T12:44:29.0791589Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 3. CUDA driver allocated memory was 1262485504 and is now 3307208704. 2025-12-04T12:44:29.0791591Z 2025-12-04T12:44:29.0791664Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0791976Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0791980Z 2025-12-04T12:44:29.0792065Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0792143Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:44:29.0792206Z ======================= 1 failed, 2 deselected in 10.13s ======================= 2025-12-04T12:44:29.0792245Z Got exit code 1 2025-12-04T12:44:29.0792285Z Retrying single test... 
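Each failure block ends with a ready-made repro command (PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py <TestClass.test_name>) and notes that the banner can be silenced with PYTORCH_PRINT_REPRO_ON_FAILURE=0; "Retrying single test..." then starts a fresh pytest session in which everything except the failing test is deselected, as the next session's "7 deselected / 1 selected" line shows. Below is a small convenience wrapper for running that printed command from the repo root with the same environment; rerun_failed_test is a hypothetical helper, not part of the test harness.

    import os
    import subprocess
    import sys

    def rerun_failed_test(test_id: str) -> int:
        # Re-create the environment from the repro line printed in the log.
        env = dict(
            os.environ,
            PYTORCH_TEST_WITH_ROCM="1",            # exercise the ROCm code paths
            PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1",  # keep the leak check enabled
        )
        cmd = [
            sys.executable,
            "test/distributed/fsdp/test_hsdp_dtensor_state_dict.py",
            test_id,
        ]
        return subprocess.run(cmd, env=env).returncode

    # Test id copied verbatim from the log:
    # rerun_failed_test("TestHSDPWithDeviceMeshAndDTensorCUDA."
    #                   "test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda")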
2025-12-04T12:44:29.0792513Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-9b1a90df380fa266.xml 2025-12-04T12:44:29.0792571Z ============================= test session starts ============================== 2025-12-04T12:44:29.0792689Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:44:29.0792731Z cachedir: .pytest_cache 2025-12-04T12:44:29.0792890Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:44:29.0792935Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:44:29.0792978Z configfile: pytest.ini 2025-12-04T12:44:29.0793143Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:44:29.0793216Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T12:44:29.0793515Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0793559Z Running 1 items in this shard 2025-12-04T12:44:29.0793562Z 2025-12-04T12:44:29.0793936Z distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda I1204 12:40:45.488000 343084 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 343153 2025-12-04T12:44:29.0794094Z I1204 12:40:45.488000 343084 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 343154 2025-12-04T12:44:29.0794266Z I1204 12:40:45.489000 343084 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 343155 2025-12-04T12:44:29.0794416Z I1204 12:40:45.489000 343084 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 343156 2025-12-04T12:44:29.0795097Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:188: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0795142Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0795827Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:188: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0795871Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0796541Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:188: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. 
Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0796604Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0797264Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:188: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0797310Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0797809Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.0797860Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.0798361Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.0798408Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.0798897Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.0798963Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.0799447Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T12:44:29.0799493Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.0799677Z E1204 12:40:54.214000 343156 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0799833Z E1204 12:40:54.214000 343156 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0800116Z E1204 12:40:54.214000 343156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0800264Z E1204 12:40:54.214000 343156 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0800543Z E1204 12:40:54.214000 343156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0800658Z E1204 12:40:54.214000 343156 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0800952Z E1204 12:40:54.214000 343156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0801091Z E1204 12:40:54.214000 343156 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0801374Z E1204 12:40:54.214000 343156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0801513Z E1204 12:40:54.214000 343156 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0801781Z E1204 12:40:54.214000 343156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0801909Z E1204 12:40:54.214000 343156 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0802181Z E1204 12:40:54.214000 343156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0802325Z E1204 12:40:54.214000 343156 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0802840Z E1204 12:40:54.214000 343156 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 3. CUDA driver allocated memory was 1113587712 and is now 3491758080. 
2025-12-04T12:44:29.0802951Z E1204 12:40:54.214000 343156 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0803143Z E1204 12:40:54.214000 343156 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0803595Z E1204 12:40:54.214000 343156 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0803703Z E1204 12:40:54.214000 343156 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0803904Z E1204 12:40:54.214000 343156 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0804065Z E1204 12:40:54.214000 343156 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:44:29.0804194Z E1204 12:40:54.239000 343155 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0804347Z E1204 12:40:54.239000 343155 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0804626Z E1204 12:40:54.239000 343155 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0804774Z E1204 12:40:54.239000 343155 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0805052Z E1204 12:40:54.239000 343155 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0805177Z E1204 12:40:54.239000 343155 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0805444Z E1204 12:40:54.239000 343155 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0805595Z E1204 12:40:54.239000 343155 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0805864Z E1204 12:40:54.239000 343155 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0806001Z E1204 12:40:54.239000 343155 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0806268Z E1204 12:40:54.239000 343155 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0806395Z E1204 12:40:54.239000 343155 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0806667Z E1204 12:40:54.239000 343155 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0806805Z E1204 12:40:54.239000 343155 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0807320Z E1204 12:40:54.239000 343155 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 2. CUDA driver allocated memory was 1266679808 and is now 3556769792. 2025-12-04T12:44:29.0807428Z E1204 12:40:54.239000 343155 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0807639Z E1204 12:40:54.239000 343155 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0808060Z E1204 12:40:54.239000 343155 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0808165Z E1204 12:40:54.239000 343155 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0808370Z E1204 12:40:54.239000 343155 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0808530Z E1204 12:40:54.239000 343155 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:44:29.0808657Z E1204 12:40:54.723000 343153 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0808813Z E1204 12:40:54.723000 343153 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0809091Z E1204 12:40:54.723000 343153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0809235Z E1204 12:40:54.723000 343153 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0809522Z E1204 12:40:54.723000 343153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0809671Z E1204 12:40:54.723000 343153 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0809954Z E1204 12:40:54.723000 343153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0810094Z E1204 12:40:54.723000 343153 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0810362Z E1204 12:40:54.723000 343153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 
3329, in wrapper 2025-12-04T12:44:29.0810503Z E1204 12:40:54.723000 343153 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0810771Z E1204 12:40:54.723000 343153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0810899Z E1204 12:40:54.723000 343153 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0811169Z E1204 12:40:54.723000 343153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0811307Z E1204 12:40:54.723000 343153 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0811823Z E1204 12:40:54.723000 343153 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3468689408. 2025-12-04T12:44:29.0811958Z E1204 12:40:54.723000 343153 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0812145Z E1204 12:40:54.723000 343153 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0812566Z E1204 12:40:54.723000 343153 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0812673Z E1204 12:40:54.723000 343153 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0812875Z E1204 12:40:54.723000 343153 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0813034Z E1204 12:40:54.723000 343153 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:44:29.0813164Z E1204 12:40:54.766000 343154 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0813317Z E1204 12:40:54.766000 343154 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0813594Z E1204 12:40:54.766000 343154 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0813755Z E1204 12:40:54.766000 343154 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0814029Z E1204 12:40:54.766000 343154 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0814157Z E1204 12:40:54.766000 343154 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0814424Z E1204 12:40:54.766000 343154 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0814565Z E1204 12:40:54.766000 343154 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0814840Z E1204 12:40:54.766000 343154 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0814976Z E1204 12:40:54.766000 343154 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0815248Z E1204 12:40:54.766000 343154 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0815374Z E1204 12:40:54.766000 343154 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0815643Z E1204 12:40:54.766000 343154 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0815781Z E1204 12:40:54.766000 343154 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0816325Z E1204 12:40:54.766000 343154 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1268776960 and is now 3307208704. 
2025-12-04T12:44:29.0816435Z E1204 12:40:54.766000 343154 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0816625Z E1204 12:40:54.766000 343154 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0817047Z E1204 12:40:54.766000 343154 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0817153Z E1204 12:40:54.766000 343154 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0817359Z E1204 12:40:54.766000 343154 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0817517Z E1204 12:40:54.766000 343154 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:44:29.0817561Z FAILED [10.1212s] [100%] 2025-12-04T12:44:29.0817563Z 2025-12-04T12:44:29.0817618Z =================================== FAILURES =================================== 2025-12-04T12:44:29.0817769Z _ TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda _ 2025-12-04T12:44:29.0817826Z Traceback (most recent call last): 2025-12-04T12:44:29.0817988Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:44:29.0818034Z self._join_processes(fn) 2025-12-04T12:44:29.0818206Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:44:29.0818272Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:44:29.0818451Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:44:29.0818497Z raise RuntimeError(error) 2025-12-04T12:44:29.0818576Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:44:29.0818624Z Traceback (most recent call last): 2025-12-04T12:44:29.0818788Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0818833Z getattr(self, test_name)() 2025-12-04T12:44:29.0818993Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0819030Z fn() 2025-12-04T12:44:29.0819181Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0819225Z method(*args, **kwargs) 2025-12-04T12:44:29.0819374Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0819415Z method(*args, **kwargs) 2025-12-04T12:44:29.0819563Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0819636Z with policy(): 2025-12-04T12:44:29.0819790Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T12:44:29.0819832Z raise RuntimeError(msg) 2025-12-04T12:44:29.0820226Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 3. CUDA driver allocated memory was 1113587712 and is now 3491758080. 2025-12-04T12:44:29.0820230Z 2025-12-04T12:44:29.0820330Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0820633Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0820636Z 2025-12-04T12:44:29.0820721Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0820724Z 2025-12-04T12:44:29.0820726Z 2025-12-04T12:44:29.0820801Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:44:29.0820886Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:44:29.0821159Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-9b1a90df380fa266.xml - 2025-12-04T12:44:29.0821221Z =========================== short test summary info ============================ 2025-12-04T12:44:29.0821537Z FAILED [10.1212s] distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:44:29.0821585Z Traceback (most recent call last): 2025-12-04T12:44:29.0821750Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0821810Z getattr(self, test_name)() 2025-12-04T12:44:29.0821969Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0822007Z fn() 2025-12-04T12:44:29.0822159Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0822216Z method(*args, **kwargs) 2025-12-04T12:44:29.0822368Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0822412Z method(*args, **kwargs) 2025-12-04T12:44:29.0822564Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0822604Z with policy(): 2025-12-04T12:44:29.0822756Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0822799Z raise RuntimeError(msg) 2025-12-04T12:44:29.0823196Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 3. CUDA driver allocated memory was 1113587712 and is now 3491758080. 
2025-12-04T12:44:29.0823199Z 2025-12-04T12:44:29.0823277Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0823588Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0823590Z 2025-12-04T12:44:29.0823677Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0823744Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:44:29.0823807Z ======================= 1 failed, 7 deselected in 10.13s ======================= 2025-12-04T12:44:29.0823847Z Got exit code 1 2025-12-04T12:44:29.0823889Z Retrying single test... 2025-12-04T12:44:29.0824117Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-88af6fa056c2577f.xml 2025-12-04T12:44:29.0824199Z ============================= test session starts ============================== 2025-12-04T12:44:29.0824313Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:44:29.0824353Z cachedir: .pytest_cache 2025-12-04T12:44:29.0824513Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:44:29.0824559Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:44:29.0824603Z configfile: pytest.ini 2025-12-04T12:44:29.0824766Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:44:29.0824839Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T12:44:29.0825132Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0825178Z Running 1 items in this shard 2025-12-04T12:44:29.0825181Z 2025-12-04T12:44:29.0825555Z distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda I1204 12:40:58.356000 343554 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 343623 2025-12-04T12:44:29.0825711Z I1204 12:40:58.356000 343554 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 343624 2025-12-04T12:44:29.0825874Z I1204 12:40:58.357000 343554 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 343625 2025-12-04T12:44:29.0826024Z I1204 12:40:58.358000 343554 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 343626 2025-12-04T12:44:29.0826715Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:188: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:44:29.0826757Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0827427Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:188: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0827472Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0828135Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:188: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0828178Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0828856Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:188: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0828899Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0829395Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.0829443Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.0829973Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.0830021Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.0830508Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.0830570Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.0831056Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.0831116Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.0831251Z E1204 12:41:07.063000 343626 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0831404Z E1204 12:41:07.063000 343626 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0831688Z E1204 12:41:07.063000 343626 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0831835Z E1204 12:41:07.063000 343626 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0832116Z E1204 12:41:07.063000 343626 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0832231Z E1204 12:41:07.063000 343626 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0832502Z E1204 12:41:07.063000 343626 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0832641Z E1204 12:41:07.063000 343626 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0832909Z E1204 12:41:07.063000 343626 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0833048Z E1204 12:41:07.063000 343626 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0833347Z E1204 12:41:07.063000 343626 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0833476Z E1204 12:41:07.063000 343626 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0833749Z E1204 12:41:07.063000 343626 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0833891Z E1204 12:41:07.063000 343626 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0834409Z E1204 12:41:07.063000 343626 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 3. CUDA driver allocated memory was 1268776960 and is now 3307208704. 
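[editor's note] The FutureWarning repeated above states that FSDP.state_dict_type() / FSDP.set_state_dict_type() are being deprecated in favor of get_state_dict() and set_state_dict() from torch.distributed.checkpoint.state_dict (see the API doc and tutorial URLs in the warning). A minimal sketch of the recommended API is below, assuming an already-wrapped FSDP model and its optimizer; the cpu_offload option mirrors the offload_to_cpu_{False,True} parametrization of the failing test. Exact keyword names should be checked against the linked documentation; this is a sketch, not the test's own code.

import torch
from torch.distributed.checkpoint.state_dict import (
    StateDictOptions,
    get_state_dict,
    set_state_dict,
)

def checkpoint_roundtrip(model: torch.nn.Module, optim: torch.optim.Optimizer) -> None:
    # Sharded (non-full) state dicts, optionally offloaded to CPU.
    opts = StateDictOptions(full_state_dict=False, cpu_offload=True)
    model_sd, optim_sd = get_state_dict(model, optim, options=opts)
    # ... save / reload model_sd and optim_sd, e.g. via torch.distributed.checkpoint ...
    set_state_dict(
        model,
        optim,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
        options=opts,
    )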
2025-12-04T12:44:29.0834518Z E1204 12:41:07.063000 343626 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0834706Z E1204 12:41:07.063000 343626 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0835130Z E1204 12:41:07.063000 343626 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0835252Z E1204 12:41:07.063000 343626 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0835468Z E1204 12:41:07.063000 343626 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0835626Z E1204 12:41:07.063000 343626 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:44:29.0835754Z E1204 12:41:07.137000 343624 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0835906Z E1204 12:41:07.137000 343624 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0836186Z E1204 12:41:07.137000 343624 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0836331Z E1204 12:41:07.137000 343624 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0836611Z E1204 12:41:07.137000 343624 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0836725Z E1204 12:41:07.137000 343624 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0836995Z E1204 12:41:07.137000 343624 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0837135Z E1204 12:41:07.137000 343624 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0837407Z E1204 12:41:07.137000 343624 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0837566Z E1204 12:41:07.137000 343624 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0837837Z E1204 12:41:07.137000 343624 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0837964Z E1204 12:41:07.137000 343624 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0838238Z E1204 12:41:07.137000 343624 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0838379Z E1204 12:41:07.137000 343624 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0838898Z E1204 12:41:07.137000 343624 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1268776960 and is now 3307208704. 2025-12-04T12:44:29.0839006Z E1204 12:41:07.137000 343624 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0839194Z E1204 12:41:07.137000 343624 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0839660Z E1204 12:41:07.137000 343624 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0839782Z E1204 12:41:07.137000 343624 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0839985Z E1204 12:41:07.137000 343624 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0840143Z E1204 12:41:07.137000 343624 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:44:29.0840271Z E1204 12:41:07.137000 343623 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0840423Z E1204 12:41:07.137000 343623 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0840701Z E1204 12:41:07.137000 343623 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0840852Z E1204 12:41:07.137000 343623 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0841131Z E1204 12:41:07.137000 343623 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0841245Z E1204 12:41:07.137000 343623 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0841516Z E1204 12:41:07.137000 343623 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0841654Z E1204 12:41:07.137000 343623 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0841945Z E1204 12:41:07.137000 343623 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 
3329, in wrapper 2025-12-04T12:44:29.0842083Z E1204 12:41:07.137000 343623 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0842353Z E1204 12:41:07.137000 343623 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0842480Z E1204 12:41:07.137000 343623 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0842750Z E1204 12:41:07.137000 343623 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0842893Z E1204 12:41:07.137000 343623 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0843409Z E1204 12:41:07.137000 343623 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3468689408. 2025-12-04T12:44:29.0843529Z E1204 12:41:07.137000 343623 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0843716Z E1204 12:41:07.137000 343623 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0844137Z E1204 12:41:07.137000 343623 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0844253Z E1204 12:41:07.137000 343623 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0844456Z E1204 12:41:07.137000 343623 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0844614Z E1204 12:41:07.137000 343623 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:44:29.0844743Z E1204 12:41:07.142000 343625 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0844895Z E1204 12:41:07.142000 343625 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0845175Z E1204 12:41:07.142000 343625 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0845322Z E1204 12:41:07.142000 343625 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0845598Z E1204 12:41:07.142000 343625 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0845714Z E1204 12:41:07.142000 343625 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0845984Z E1204 12:41:07.142000 343625 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0846146Z E1204 12:41:07.142000 343625 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0846415Z E1204 12:41:07.142000 343625 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0846552Z E1204 12:41:07.142000 343625 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0846824Z E1204 12:41:07.142000 343625 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0846951Z E1204 12:41:07.142000 343625 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0847222Z E1204 12:41:07.142000 343625 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0847361Z E1204 12:41:07.142000 343625 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0847876Z E1204 12:41:07.142000 343625 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 2. CUDA driver allocated memory was 1268776960 and is now 3307208704. 
2025-12-04T12:44:29.0847992Z E1204 12:41:07.142000 343625 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0848181Z E1204 12:41:07.142000 343625 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0848619Z E1204 12:41:07.142000 343625 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0848724Z E1204 12:41:07.142000 343625 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0848927Z E1204 12:41:07.142000 343625 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0849085Z E1204 12:41:07.142000 343625 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:44:29.0849129Z FAILED [10.0211s] [100%] 2025-12-04T12:44:29.0849130Z 2025-12-04T12:44:29.0849189Z =================================== FAILURES =================================== 2025-12-04T12:44:29.0849338Z _ TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda _ 2025-12-04T12:44:29.0849386Z Traceback (most recent call last): 2025-12-04T12:44:29.0849548Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:44:29.0849628Z self._join_processes(fn) 2025-12-04T12:44:29.0849802Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:44:29.0849858Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:44:29.0850038Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:44:29.0850083Z raise RuntimeError(error) 2025-12-04T12:44:29.0850162Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:44:29.0850208Z Traceback (most recent call last): 2025-12-04T12:44:29.0850398Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0850443Z getattr(self, test_name)() 2025-12-04T12:44:29.0850602Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0850637Z fn() 2025-12-04T12:44:29.0850789Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0850831Z method(*args, **kwargs) 2025-12-04T12:44:29.0850980Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0851021Z method(*args, **kwargs) 2025-12-04T12:44:29.0851171Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0851211Z with policy(): 2025-12-04T12:44:29.0851367Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T12:44:29.0851409Z raise RuntimeError(msg) 2025-12-04T12:44:29.0851812Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 3. CUDA driver allocated memory was 1268776960 and is now 3307208704. 2025-12-04T12:44:29.0851829Z 2025-12-04T12:44:29.0851904Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0852207Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0852221Z 2025-12-04T12:44:29.0852309Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0852311Z 2025-12-04T12:44:29.0852313Z 2025-12-04T12:44:29.0852391Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:44:29.0852479Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:44:29.0852752Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-88af6fa056c2577f.xml - 2025-12-04T12:44:29.0852814Z =========================== short test summary info ============================ 2025-12-04T12:44:29.0853127Z FAILED [10.0211s] distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:44:29.0853177Z Traceback (most recent call last): 2025-12-04T12:44:29.0853342Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0853387Z getattr(self, test_name)() 2025-12-04T12:44:29.0853549Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0853585Z fn() 2025-12-04T12:44:29.0853736Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0853778Z method(*args, **kwargs) 2025-12-04T12:44:29.0853930Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0853971Z method(*args, **kwargs) 2025-12-04T12:44:29.0854122Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0854160Z with policy(): 2025-12-04T12:44:29.0854330Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0854373Z raise RuntimeError(msg) 2025-12-04T12:44:29.0854770Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 3. CUDA driver allocated memory was 1268776960 and is now 3307208704. 
2025-12-04T12:44:29.0854775Z 2025-12-04T12:44:29.0854850Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0855152Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0855155Z 2025-12-04T12:44:29.0855243Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0855306Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:44:29.0855365Z ======================= 1 failed, 7 deselected in 10.03s ======================= 2025-12-04T12:44:29.0855405Z Got exit code 1 2025-12-04T12:44:29.0855658Z FAILED CONSISTENTLY: test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0855798Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:44:29.0856025Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-3dbaaa6d6cb49267.xml 2025-12-04T12:44:29.0856085Z ============================= test session starts ============================== 2025-12-04T12:44:29.0856212Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:44:29.0856256Z cachedir: .pytest_cache 2025-12-04T12:44:29.0856416Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:44:29.0856462Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:44:29.0856503Z configfile: pytest.ini 2025-12-04T12:44:29.0856664Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:44:29.0856737Z collecting ... collected 8 items / 3 deselected / 5 selected 2025-12-04T12:44:29.0856790Z stepcurrent: skipping 3 already run items. 2025-12-04T12:44:29.0856836Z Running 5 items in this shard 2025-12-04T12:44:29.0856838Z 2025-12-04T12:44:29.0857212Z distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda I1204 12:41:10.978000 344024 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 344093 2025-12-04T12:44:29.0857369Z I1204 12:41:10.979000 344024 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 344094 2025-12-04T12:44:29.0857520Z I1204 12:41:10.979000 344024 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 344095 2025-12-04T12:44:29.0857674Z I1204 12:41:10.980000 344024 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 344096 2025-12-04T12:44:29.0858372Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:188: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0858417Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0859085Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:188: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0859127Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0859820Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:188: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0859863Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0860521Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:188: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0860581Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0861089Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.0861140Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.0861629Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.0861677Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.0862168Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T12:44:29.0862214Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.0862709Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.0862755Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.0862889Z E1204 12:41:19.879000 344093 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0863069Z E1204 12:41:19.879000 344093 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0863350Z E1204 12:41:19.879000 344093 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0863498Z E1204 12:41:19.879000 344093 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0863778Z E1204 12:41:19.879000 344093 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0863893Z E1204 12:41:19.879000 344093 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0864164Z E1204 12:41:19.879000 344093 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0864306Z E1204 12:41:19.879000 344093 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0864575Z E1204 12:41:19.879000 344093 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0864731Z E1204 12:41:19.879000 344093 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0865000Z E1204 12:41:19.879000 344093 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0865141Z E1204 12:41:19.879000 344093 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0865412Z E1204 12:41:19.879000 344093 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0865550Z E1204 12:41:19.879000 344093 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0866068Z E1204 12:41:19.879000 344093 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda! 
Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3449815040. 2025-12-04T12:44:29.0866179Z E1204 12:41:19.879000 344093 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0866371Z E1204 12:41:19.879000 344093 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0866795Z E1204 12:41:19.879000 344093 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0866902Z E1204 12:41:19.879000 344093 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0867103Z E1204 12:41:19.879000 344093 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0867260Z E1204 12:41:19.879000 344093 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:44:29.0867411Z E1204 12:41:19.936000 344095 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0867562Z E1204 12:41:19.936000 344095 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0867841Z E1204 12:41:19.936000 344095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0867989Z E1204 12:41:19.936000 344095 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0868268Z E1204 12:41:19.936000 344095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0868388Z E1204 12:41:19.936000 344095 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0868657Z E1204 12:41:19.936000 344095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0868798Z E1204 12:41:19.936000 344095 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0869065Z E1204 12:41:19.936000 344095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0869214Z E1204 12:41:19.936000 344095 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0869485Z E1204 12:41:19.936000 344095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0869687Z E1204 12:41:19.936000 344095 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T12:44:29.0869961Z E1204 12:41:19.936000 344095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0870098Z E1204 12:41:19.936000 344095 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0870614Z E1204 12:41:19.936000 344095 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 2. CUDA driver allocated memory was 1262485504 and is now 3284140032. 2025-12-04T12:44:29.0870723Z E1204 12:41:19.936000 344095 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0870914Z E1204 12:41:19.936000 344095 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0871333Z E1204 12:41:19.936000 344095 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0871439Z E1204 12:41:19.936000 344095 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0871643Z E1204 12:41:19.936000 344095 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0872356Z E1204 12:41:19.936000 344095 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:44:29.0872487Z E1204 12:41:19.937000 344096 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0872638Z E1204 12:41:19.937000 344096 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0872918Z E1204 12:41:19.937000 344096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0873064Z E1204 12:41:19.937000 344096 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0873346Z E1204 12:41:19.937000 344096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0873462Z E1204 12:41:19.937000 344096 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0873731Z E1204 12:41:19.937000 344096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0873870Z E1204 12:41:19.937000 344096 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0874153Z E1204 12:41:19.937000 344096 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0874292Z E1204 12:41:19.937000 344096 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0874576Z E1204 12:41:19.937000 344096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0874703Z E1204 12:41:19.937000 344096 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0874972Z E1204 12:41:19.937000 344096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0875112Z E1204 12:41:19.937000 344096 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0875627Z E1204 12:41:19.937000 344096 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 3. CUDA driver allocated memory was 1268776960 and is now 3284140032. 2025-12-04T12:44:29.0875734Z E1204 12:41:19.937000 344096 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0875921Z E1204 12:41:19.937000 344096 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0876340Z E1204 12:41:19.937000 344096 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0876446Z E1204 12:41:19.937000 344096 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0876669Z E1204 12:41:19.937000 344096 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0876826Z E1204 12:41:19.937000 344096 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:44:29.0876955Z E1204 12:41:19.950000 344094 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0877107Z E1204 12:41:19.950000 344094 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0877388Z E1204 12:41:19.950000 344094 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0877535Z E1204 12:41:19.950000 344094 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0877815Z E1204 12:41:19.950000 344094 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0877930Z E1204 12:41:19.950000 344094 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0878201Z E1204 12:41:19.950000 344094 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0878354Z E1204 12:41:19.950000 344094 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0878622Z E1204 12:41:19.950000 344094 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0878771Z E1204 12:41:19.950000 344094 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0879042Z E1204 12:41:19.950000 344094 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0879169Z E1204 12:41:19.950000 344094 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0879441Z E1204 12:41:19.950000 344094 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0879622Z E1204 12:41:19.950000 344094 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0880137Z E1204 12:41:19.950000 344094 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1268776960 and is now 3284140032. 
2025-12-04T12:44:29.0880243Z E1204 12:41:19.950000 344094 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0880434Z E1204 12:41:19.950000 344094 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0880851Z E1204 12:41:19.950000 344094 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0880992Z E1204 12:41:19.950000 344094 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0881197Z E1204 12:41:19.950000 344094 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0881352Z E1204 12:41:19.950000 344094 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:44:29.0881394Z FAILED [10.2207s] [ 20%] 2025-12-04T12:44:29.0881397Z 2025-12-04T12:44:29.0881452Z =================================== FAILURES =================================== 2025-12-04T12:44:29.0881599Z _ TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda _ 2025-12-04T12:44:29.0881645Z Traceback (most recent call last): 2025-12-04T12:44:29.0881809Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:44:29.0881854Z self._join_processes(fn) 2025-12-04T12:44:29.0882028Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:44:29.0882081Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:44:29.0882259Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:44:29.0882321Z raise RuntimeError(error) 2025-12-04T12:44:29.0882402Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:44:29.0882446Z Traceback (most recent call last): 2025-12-04T12:44:29.0882609Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0882650Z getattr(self, test_name)() 2025-12-04T12:44:29.0882824Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0882858Z fn() 2025-12-04T12:44:29.0883012Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0883051Z method(*args, **kwargs) 2025-12-04T12:44:29.0883205Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0883245Z method(*args, **kwargs) 2025-12-04T12:44:29.0883395Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0883434Z with policy(): 2025-12-04T12:44:29.0883588Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T12:44:29.0883631Z raise RuntimeError(msg) 2025-12-04T12:44:29.0884027Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3449815040. 2025-12-04T12:44:29.0884030Z 2025-12-04T12:44:29.0884108Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0884409Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0884412Z 2025-12-04T12:44:29.0884501Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0884504Z 2025-12-04T12:44:29.0884505Z 2025-12-04T12:44:29.0884581Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:44:29.0884668Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:44:29.0884962Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-3dbaaa6d6cb49267.xml - 2025-12-04T12:44:29.0885023Z =========================== short test summary info ============================ 2025-12-04T12:44:29.0885337Z FAILED [10.2207s] distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:44:29.0885384Z Traceback (most recent call last): 2025-12-04T12:44:29.0885550Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0885592Z getattr(self, test_name)() 2025-12-04T12:44:29.0885755Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0885790Z fn() 2025-12-04T12:44:29.0885941Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0885981Z method(*args, **kwargs) 2025-12-04T12:44:29.0886134Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0886173Z method(*args, **kwargs) 2025-12-04T12:44:29.0886334Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0886371Z with policy(): 2025-12-04T12:44:29.0886524Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0886564Z raise RuntimeError(msg) 2025-12-04T12:44:29.0886971Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3449815040. 
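[editor's note] The "Started process N with pid ..." lines and "Process 0 terminated with exit code 10, terminating remaining processes." come from the multiprocess harness in common_distributed.py: each rank runs the test in its own process, a rank that trips the leak check exits with code 10, and the parent's _join_processes/_check_return_codes turn any non-zero exit into the RuntimeError shown in the FAILURES section. The self-contained sketch below only illustrates that join-and-check pattern with the standard multiprocessing module; it is not the actual harness, and exit code 10 is simply mirrored from the log.

import multiprocessing as mp
import sys

def _rank_main(rank: int) -> None:
    # Stand-in for one test rank; pretend rank 0 failed its leak check.
    failed = rank == 0
    sys.exit(10 if failed else 0)

def run_ranks(world_size: int = 4) -> None:
    ctx = mp.get_context("spawn")
    procs = [ctx.Process(target=_rank_main, args=(r,)) for r in range(world_size)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    for rank, p in enumerate(procs):
        if p.exitcode != 0:
            # The parent surfaces the child failure, as _check_return_codes does above.
            raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")

if __name__ == "__main__":
    run_ranks()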
2025-12-04T12:44:29.0886973Z 2025-12-04T12:44:29.0887048Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0887352Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0887355Z 2025-12-04T12:44:29.0887442Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0887504Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:44:29.0887566Z ======================= 1 failed, 3 deselected in 10.23s ======================= 2025-12-04T12:44:29.0887603Z Got exit code 1 2025-12-04T12:44:29.0887646Z Retrying single test... 2025-12-04T12:44:29.0887870Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-015a285330bddf1a.xml 2025-12-04T12:44:29.0887928Z ============================= test session starts ============================== 2025-12-04T12:44:29.0888041Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:44:29.0888084Z cachedir: .pytest_cache 2025-12-04T12:44:29.0888243Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:44:29.0888288Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:44:29.0888329Z configfile: pytest.ini 2025-12-04T12:44:29.0888492Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:44:29.0888589Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T12:44:29.0888882Z stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0888927Z Running 1 items in this shard 2025-12-04T12:44:29.0888929Z 2025-12-04T12:44:29.0889302Z distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda I1204 12:41:23.999000 344494 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 344563 2025-12-04T12:44:29.0889460Z I1204 12:41:24.000000 344494 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 344564 2025-12-04T12:44:29.0889647Z I1204 12:41:24.001000 344494 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 344565 2025-12-04T12:44:29.0889801Z I1204 12:41:24.001000 344494 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 344566 2025-12-04T12:44:29.0890488Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:188: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:44:29.0890547Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0891217Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:188: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0891274Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0891941Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:188: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0891985Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0892649Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:188: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0892692Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0893183Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.0893233Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.0893746Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.0893793Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.0894276Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.0894322Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.0894816Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.0894863Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.0894996Z E1204 12:41:32.837000 344566 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0895159Z E1204 12:41:32.837000 344566 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0895440Z E1204 12:41:32.837000 344566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0895598Z E1204 12:41:32.837000 344566 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0895875Z E1204 12:41:32.837000 344566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0895991Z E1204 12:41:32.837000 344566 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0896260Z E1204 12:41:32.837000 344566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0896401Z E1204 12:41:32.837000 344566 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0896672Z E1204 12:41:32.837000 344566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0896812Z E1204 12:41:32.837000 344566 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0897079Z E1204 12:41:32.837000 344566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0897206Z E1204 12:41:32.837000 344566 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0897479Z E1204 12:41:32.837000 344566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0897617Z E1204 12:41:32.837000 344566 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0898157Z E1204 12:41:32.837000 344566 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 3. CUDA driver allocated memory was 1268776960 and is now 3284140032. 
2025-12-04T12:44:29.0898266Z E1204 12:41:32.837000 344566 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0898457Z E1204 12:41:32.837000 344566 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0898880Z E1204 12:41:32.837000 344566 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0898986Z E1204 12:41:32.837000 344566 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0899190Z E1204 12:41:32.837000 344566 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0899346Z E1204 12:41:32.837000 344566 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:44:29.0899487Z E1204 12:41:32.844000 344565 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0899683Z E1204 12:41:32.844000 344565 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0899963Z E1204 12:41:32.844000 344565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0900123Z E1204 12:41:32.844000 344565 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0900397Z E1204 12:41:32.844000 344565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0900514Z E1204 12:41:32.844000 344565 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0900782Z E1204 12:41:32.844000 344565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0900921Z E1204 12:41:32.844000 344565 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0901190Z E1204 12:41:32.844000 344565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0901329Z E1204 12:41:32.844000 344565 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0901596Z E1204 12:41:32.844000 344565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0901724Z E1204 12:41:32.844000 344565 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0901992Z E1204 12:41:32.844000 344565 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0902155Z E1204 12:41:32.844000 344565 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0902669Z E1204 12:41:32.844000 344565 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 2. CUDA driver allocated memory was 1268776960 and is now 3284140032. 2025-12-04T12:44:29.0902778Z E1204 12:41:32.844000 344565 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0902966Z E1204 12:41:32.844000 344565 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0903389Z E1204 12:41:32.844000 344565 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0903495Z E1204 12:41:32.844000 344565 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0903696Z E1204 12:41:32.844000 344565 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0903867Z E1204 12:41:32.844000 344565 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:44:29.0903996Z E1204 12:41:32.895000 344564 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0904145Z E1204 12:41:32.895000 344564 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0904434Z E1204 12:41:32.895000 344564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0904580Z E1204 12:41:32.895000 344564 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0904857Z E1204 12:41:32.895000 344564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0904972Z E1204 12:41:32.895000 344564 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0905239Z E1204 12:41:32.895000 344564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0905381Z E1204 12:41:32.895000 344564 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0905648Z E1204 12:41:32.895000 344564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, 
in wrapper 2025-12-04T12:44:29.0905788Z E1204 12:41:32.895000 344564 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0906057Z E1204 12:41:32.895000 344564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0906183Z E1204 12:41:32.895000 344564 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0906477Z E1204 12:41:32.895000 344564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0906616Z E1204 12:41:32.895000 344564 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0907125Z E1204 12:41:32.895000 344564 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1268776960 and is now 3284140032. 2025-12-04T12:44:29.0907233Z E1204 12:41:32.895000 344564 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0907424Z E1204 12:41:32.895000 344564 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0907845Z E1204 12:41:32.895000 344564 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0907950Z E1204 12:41:32.895000 344564 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0908161Z E1204 12:41:32.895000 344564 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0908318Z E1204 12:41:32.895000 344564 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:44:29.0908457Z E1204 12:41:32.906000 344563 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0908609Z E1204 12:41:32.906000 344563 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0908888Z E1204 12:41:32.906000 344563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0909031Z E1204 12:41:32.906000 344563 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0909308Z E1204 12:41:32.906000 344563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0909423Z E1204 12:41:32.906000 344563 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0909735Z E1204 12:41:32.906000 344563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0909874Z E1204 12:41:32.906000 344563 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0910149Z E1204 12:41:32.906000 344563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0910289Z E1204 12:41:32.906000 344563 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0910556Z E1204 12:41:32.906000 344563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0910709Z E1204 12:41:32.906000 344563 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0910980Z E1204 12:41:32.906000 344563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0911117Z E1204 12:41:32.906000 344563 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0911628Z E1204 12:41:32.906000 344563 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3443523584. 
2025-12-04T12:44:29.0911736Z E1204 12:41:32.906000 344563 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0911925Z E1204 12:41:32.906000 344563 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0912352Z E1204 12:41:32.906000 344563 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0912473Z E1204 12:41:32.906000 344563 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0912673Z E1204 12:41:32.906000 344563 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0912851Z E1204 12:41:32.906000 344563 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:44:29.0912893Z FAILED [10.1198s] [100%] 2025-12-04T12:44:29.0912895Z 2025-12-04T12:44:29.0912950Z =================================== FAILURES =================================== 2025-12-04T12:44:29.0913096Z _ TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda _ 2025-12-04T12:44:29.0913141Z Traceback (most recent call last): 2025-12-04T12:44:29.0913305Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:44:29.0913349Z self._join_processes(fn) 2025-12-04T12:44:29.0913522Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:44:29.0913575Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:44:29.0913757Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:44:29.0913800Z raise RuntimeError(error) 2025-12-04T12:44:29.0913880Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:44:29.0913925Z Traceback (most recent call last): 2025-12-04T12:44:29.0914089Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0914132Z getattr(self, test_name)() 2025-12-04T12:44:29.0914292Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0914327Z fn() 2025-12-04T12:44:29.0914481Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0914521Z method(*args, **kwargs) 2025-12-04T12:44:29.0914674Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0914734Z method(*args, **kwargs) 2025-12-04T12:44:29.0914885Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0914924Z with policy(): 2025-12-04T12:44:29.0915076Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T12:44:29.0915117Z raise RuntimeError(msg) 2025-12-04T12:44:29.0915511Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 2. CUDA driver allocated memory was 1268776960 and is now 3284140032. 2025-12-04T12:44:29.0915514Z 2025-12-04T12:44:29.0915590Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0915892Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0915895Z 2025-12-04T12:44:29.0915982Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0915984Z 2025-12-04T12:44:29.0915986Z 2025-12-04T12:44:29.0916060Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:44:29.0916156Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:44:29.0916425Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-015a285330bddf1a.xml - 2025-12-04T12:44:29.0916484Z =========================== short test summary info ============================ 2025-12-04T12:44:29.0916807Z FAILED [10.1198s] distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:44:29.0916853Z Traceback (most recent call last): 2025-12-04T12:44:29.0917020Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0917062Z getattr(self, test_name)() 2025-12-04T12:44:29.0917223Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0917258Z fn() 2025-12-04T12:44:29.0917411Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0917450Z method(*args, **kwargs) 2025-12-04T12:44:29.0917602Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0917644Z method(*args, **kwargs) 2025-12-04T12:44:29.0917795Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0917833Z with policy(): 2025-12-04T12:44:29.0917983Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0918024Z raise RuntimeError(msg) 2025-12-04T12:44:29.0918421Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 2. CUDA driver allocated memory was 1268776960 and is now 3284140032. 
2025-12-04T12:44:29.0918424Z 2025-12-04T12:44:29.0918499Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0918818Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0918820Z 2025-12-04T12:44:29.0918908Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0918970Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:44:29.0919033Z ======================= 1 failed, 7 deselected in 10.13s ======================= 2025-12-04T12:44:29.0919071Z Got exit code 1 2025-12-04T12:44:29.0919111Z Retrying single test... 2025-12-04T12:44:29.0919338Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-01fc5e09ab395607.xml 2025-12-04T12:44:29.0919397Z ============================= test session starts ============================== 2025-12-04T12:44:29.0919512Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:44:29.0919553Z cachedir: .pytest_cache 2025-12-04T12:44:29.0919748Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:44:29.0919793Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:44:29.0919834Z configfile: pytest.ini 2025-12-04T12:44:29.0919996Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:44:29.0920083Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T12:44:29.0920376Z stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0920419Z Running 1 items in this shard 2025-12-04T12:44:29.0920435Z 2025-12-04T12:44:29.0920808Z distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda I1204 12:41:36.726000 344964 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 345033 2025-12-04T12:44:29.0920964Z I1204 12:41:36.726000 344964 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 345034 2025-12-04T12:44:29.0921117Z I1204 12:41:36.727000 344964 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 345035 2025-12-04T12:44:29.0921270Z I1204 12:41:36.727000 344964 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 345036 2025-12-04T12:44:29.0921953Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:188: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:44:29.0921996Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0922660Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:188: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0922702Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0923390Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:188: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0923434Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0924097Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:188: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0924141Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0924640Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.0924688Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.0925189Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.0925245Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.0925734Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.0925780Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.0926266Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.0926313Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.0926449Z E1204 12:41:45.500000 345034 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0926603Z E1204 12:41:45.500000 345034 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0926891Z E1204 12:41:45.500000 345034 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0927038Z E1204 12:41:45.500000 345034 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0927315Z E1204 12:41:45.500000 345034 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0927434Z E1204 12:41:45.500000 345034 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0927721Z E1204 12:41:45.500000 345034 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0927861Z E1204 12:41:45.500000 345034 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0928131Z E1204 12:41:45.500000 345034 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0928270Z E1204 12:41:45.500000 345034 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0928541Z E1204 12:41:45.500000 345034 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0928671Z E1204 12:41:45.500000 345034 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0928946Z E1204 12:41:45.500000 345034 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0929085Z E1204 12:41:45.500000 345034 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0929651Z E1204 12:41:45.500000 345034 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1268776960 and is now 3424649216. 
2025-12-04T12:44:29.0929776Z E1204 12:41:45.500000 345034 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0929965Z E1204 12:41:45.500000 345034 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0930385Z E1204 12:41:45.500000 345034 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0930493Z E1204 12:41:45.500000 345034 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0930696Z E1204 12:41:45.500000 345034 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0930856Z E1204 12:41:45.500000 345034 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:44:29.0930984Z E1204 12:41:45.507000 345035 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0931138Z E1204 12:41:45.507000 345035 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0931416Z E1204 12:41:45.507000 345035 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0931563Z E1204 12:41:45.507000 345035 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0931872Z E1204 12:41:45.507000 345035 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0931988Z E1204 12:41:45.507000 345035 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0932255Z E1204 12:41:45.507000 345035 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0932393Z E1204 12:41:45.507000 345035 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0932662Z E1204 12:41:45.507000 345035 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0932799Z E1204 12:41:45.507000 345035 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0933070Z E1204 12:41:45.507000 345035 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0933197Z E1204 12:41:45.507000 345035 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0933469Z E1204 12:41:45.507000 345035 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0933620Z E1204 12:41:45.507000 345035 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0934135Z E1204 12:41:45.507000 345035 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 2. CUDA driver allocated memory was 1268776960 and is now 3368026112. 2025-12-04T12:44:29.0934253Z E1204 12:41:45.507000 345035 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0934441Z E1204 12:41:45.507000 345035 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0934860Z E1204 12:41:45.507000 345035 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0934967Z E1204 12:41:45.507000 345035 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0935173Z E1204 12:41:45.507000 345035 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0935329Z E1204 12:41:45.507000 345035 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:44:29.0935458Z E1204 12:41:45.531000 345036 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0935609Z E1204 12:41:45.531000 345036 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0935892Z E1204 12:41:45.531000 345036 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0936037Z E1204 12:41:45.531000 345036 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0936332Z E1204 12:41:45.531000 345036 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0936447Z E1204 12:41:45.531000 345036 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0936716Z E1204 12:41:45.531000 345036 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0936857Z E1204 12:41:45.531000 345036 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0937125Z E1204 12:41:45.531000 345036 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, 
in wrapper 2025-12-04T12:44:29.0937265Z E1204 12:41:45.531000 345036 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0937534Z E1204 12:41:45.531000 345036 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0937660Z E1204 12:41:45.531000 345036 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0937943Z E1204 12:41:45.531000 345036 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0938082Z E1204 12:41:45.531000 345036 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0938605Z E1204 12:41:45.531000 345036 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 3. CUDA driver allocated memory was 1254096896 and is now 3368026112. 2025-12-04T12:44:29.0938713Z E1204 12:41:45.531000 345036 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0938900Z E1204 12:41:45.531000 345036 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0939320Z E1204 12:41:45.531000 345036 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0939428Z E1204 12:41:45.531000 345036 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0939671Z E1204 12:41:45.531000 345036 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0939828Z E1204 12:41:45.531000 345036 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:44:29.0939957Z E1204 12:41:45.978000 345033 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0940107Z E1204 12:41:45.978000 345033 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0940388Z E1204 12:41:45.978000 345033 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0940563Z E1204 12:41:45.978000 345033 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0940839Z E1204 12:41:45.978000 345033 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0940953Z E1204 12:41:45.978000 345033 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0941220Z E1204 12:41:45.978000 345033 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0945258Z E1204 12:41:45.978000 345033 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0945560Z E1204 12:41:45.978000 345033 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0945700Z E1204 12:41:45.978000 345033 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0945969Z E1204 12:41:45.978000 345033 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0946135Z E1204 12:41:45.978000 345033 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0946419Z E1204 12:41:45.978000 345033 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0946590Z E1204 12:41:45.978000 345033 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0947110Z E1204 12:41:45.978000 345033 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3443523584. 
2025-12-04T12:44:29.0947220Z E1204 12:41:45.978000 345033 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0947411Z E1204 12:41:45.978000 345033 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0947834Z E1204 12:41:45.978000 345033 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0947942Z E1204 12:41:45.978000 345033 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0948145Z E1204 12:41:45.978000 345033 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0948300Z E1204 12:41:45.978000 345033 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:44:29.0948345Z FAILED [10.3195s] [100%] 2025-12-04T12:44:29.0948348Z 2025-12-04T12:44:29.0948404Z =================================== FAILURES =================================== 2025-12-04T12:44:29.0948553Z _ TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda _ 2025-12-04T12:44:29.0948605Z Traceback (most recent call last): 2025-12-04T12:44:29.0948781Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:44:29.0948826Z self._join_processes(fn) 2025-12-04T12:44:29.0949001Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:44:29.0949056Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:44:29.0949236Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:44:29.0949282Z raise RuntimeError(error) 2025-12-04T12:44:29.0949362Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:44:29.0949406Z Traceback (most recent call last): 2025-12-04T12:44:29.0949694Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0949742Z getattr(self, test_name)() 2025-12-04T12:44:29.0949907Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0949942Z fn() 2025-12-04T12:44:29.0950098Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0959702Z method(*args, **kwargs) 2025-12-04T12:44:29.0959886Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0959984Z method(*args, **kwargs) 2025-12-04T12:44:29.0960143Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0960183Z with policy(): 2025-12-04T12:44:29.0960346Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T12:44:29.0960414Z raise RuntimeError(msg) 2025-12-04T12:44:29.0960822Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1268776960 and is now 3424649216. 2025-12-04T12:44:29.0960826Z 2025-12-04T12:44:29.0960905Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0961213Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0961216Z 2025-12-04T12:44:29.0961306Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0961310Z 2025-12-04T12:44:29.0961371Z Process 2 exited with error code 10 and exception: 2025-12-04T12:44:29.0961419Z Traceback (most recent call last): 2025-12-04T12:44:29.0961587Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0961630Z getattr(self, test_name)() 2025-12-04T12:44:29.0961792Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0961827Z fn() 2025-12-04T12:44:29.0961979Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0962021Z method(*args, **kwargs) 2025-12-04T12:44:29.0962172Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0962211Z method(*args, **kwargs) 2025-12-04T12:44:29.0962381Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0962419Z with policy(): 2025-12-04T12:44:29.0962572Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0962612Z raise RuntimeError(msg) 2025-12-04T12:44:29.0963010Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 2. CUDA driver allocated memory was 1268776960 and is now 3368026112. 2025-12-04T12:44:29.0963015Z 2025-12-04T12:44:29.0963090Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0963423Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0963428Z 2025-12-04T12:44:29.0963517Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0963519Z 2025-12-04T12:44:29.0963521Z 2025-12-04T12:44:29.0963600Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:44:29.0963689Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:44:29.0963961Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-01fc5e09ab395607.xml - 2025-12-04T12:44:29.0964045Z =========================== short test summary info ============================ 2025-12-04T12:44:29.0964365Z FAILED [10.3195s] distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:44:29.0964425Z Traceback (most recent call last): 2025-12-04T12:44:29.0964590Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0964632Z getattr(self, test_name)() 2025-12-04T12:44:29.0964794Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0964829Z fn() 2025-12-04T12:44:29.0964979Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0965019Z method(*args, **kwargs) 2025-12-04T12:44:29.0965171Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0965210Z method(*args, **kwargs) 2025-12-04T12:44:29.0965362Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0965399Z with policy(): 2025-12-04T12:44:29.0965551Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0965590Z raise RuntimeError(msg) 2025-12-04T12:44:29.0965992Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1268776960 and is now 3424649216. 
2025-12-04T12:44:29.0965995Z 2025-12-04T12:44:29.0966069Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0966368Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0966371Z 2025-12-04T12:44:29.0966467Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0966470Z 2025-12-04T12:44:29.0966529Z Process 2 exited with error code 10 and exception: 2025-12-04T12:44:29.0966574Z Traceback (most recent call last): 2025-12-04T12:44:29.0966738Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0966780Z getattr(self, test_name)() 2025-12-04T12:44:29.0966939Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0966974Z fn() 2025-12-04T12:44:29.0967124Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0967163Z method(*args, **kwargs) 2025-12-04T12:44:29.0967327Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0967368Z method(*args, **kwargs) 2025-12-04T12:44:29.0967516Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0967553Z with policy(): 2025-12-04T12:44:29.0967702Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0967744Z raise RuntimeError(msg) 2025-12-04T12:44:29.0968151Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 2. CUDA driver allocated memory was 1268776960 and is now 3368026112. 2025-12-04T12:44:29.0968154Z 2025-12-04T12:44:29.0968248Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0968547Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0968549Z 2025-12-04T12:44:29.0968635Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0968699Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
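The RuntimeError reported above comes from the harness's CUDA memory-leak check: it records per-device caching-allocator usage before each test and compares it afterwards, and any growth (here 13824 bytes on device 1, alongside driver-allocated memory jumping from 1268776960 to 3424649216 bytes) is raised as a leak; the "CUDA driver API confirmed a leak" wording reflects that the allocator delta is cross-checked against the driver-reported allocation. A minimal Python sketch of that before/after comparison, using only the public torch.cuda API; it is not the actual CudaMemoryLeakCheck implementation in torch.testing._internal.common_utils, which is more involved.

import torch

def assert_no_cuda_leak(test_fn):
    # Sketch only: mirrors the before/after comparison described in the
    # leak messages above, not PyTorch's real CudaMemoryLeakCheck.
    num_devices = torch.cuda.device_count()
    torch.cuda.synchronize()
    before = [torch.cuda.memory_allocated(d) for d in range(num_devices)]

    test_fn()

    torch.cuda.synchronize()
    torch.cuda.empty_cache()  # drop cached blocks so only live tensors stay counted
    after = [torch.cuda.memory_allocated(d) for d in range(num_devices)]

    for device, (b, a) in enumerate(zip(before, after)):
        if a > b:
            raise RuntimeError(
                f"possible leak on device {device}: caching allocator allocated "
                f"memory was {b} and is now reported as {a}"
            )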
2025-12-04T12:44:29.0968762Z ======================= 1 failed, 7 deselected in 10.33s ======================= 2025-12-04T12:44:29.0968801Z Got exit code 1 2025-12-04T12:44:29.0969051Z FAILED CONSISTENTLY: test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda 2025-12-04T12:44:29.0969180Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:44:29.0969407Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-66bc2c5f652db8bb.xml 2025-12-04T12:44:29.0969467Z ============================= test session starts ============================== 2025-12-04T12:44:29.0969615Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:44:29.0969657Z cachedir: .pytest_cache 2025-12-04T12:44:29.0969814Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:44:29.0969862Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:44:29.0969903Z configfile: pytest.ini 2025-12-04T12:44:29.0970069Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:44:29.0970144Z collecting ... collected 8 items / 4 deselected / 4 selected 2025-12-04T12:44:29.0970198Z stepcurrent: skipping 4 already run items. 2025-12-04T12:44:29.0970259Z Running 4 items in this shard 2025-12-04T12:44:29.0970261Z 2025-12-04T12:44:29.0970646Z distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda I1204 12:41:49.797000 345434 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 345503 2025-12-04T12:44:29.0970800Z I1204 12:41:49.798000 345434 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 345504 2025-12-04T12:44:29.0970952Z I1204 12:41:49.798000 345434 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 345505 2025-12-04T12:44:29.0971103Z I1204 12:41:49.799000 345434 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 345506 2025-12-04T12:44:29.0971808Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:118: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0971852Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0972517Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:118: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0972588Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0973257Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:118: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0973299Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0973962Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:118: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0974005Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0974500Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.0974552Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.0975058Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.0975106Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.0975594Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.0975641Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.0976139Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T12:44:29.0976188Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.0976857Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:130: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0976912Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0977578Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:130: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0977630Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0978118Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.0978182Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:44:29.0978669Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.0978727Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:44:29.0978963Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.0979007Z local_shape = tensor.shape 2025-12-04T12:44:29.0979243Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.0979280Z tensor.shape, 2025-12-04T12:44:29.0979516Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.0979554Z tensor.dtype, 2025-12-04T12:44:29.0979836Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.0979880Z local_shape = tensor.shape 2025-12-04T12:44:29.0980112Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:44:29.0980149Z tensor.shape, 2025-12-04T12:44:29.0980380Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.0980416Z tensor.dtype, 2025-12-04T12:44:29.0981104Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:130: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0981149Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0981814Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:130: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.0981873Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.0982374Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.0982430Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:44:29.0982911Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.0982968Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:44:29.0983205Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.0983247Z local_shape = tensor.shape 2025-12-04T12:44:29.0983478Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.0983520Z local_shape = tensor.shape 2025-12-04T12:44:29.0983751Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.0983788Z tensor.shape, 2025-12-04T12:44:29.0984018Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:44:29.0984055Z tensor.dtype, 2025-12-04T12:44:29.0984295Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.0984333Z tensor.shape, 2025-12-04T12:44:29.0984564Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.0984600Z tensor.dtype, 2025-12-04T12:44:29.0984737Z E1204 12:41:59.717000 345504 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0984896Z E1204 12:41:59.717000 345504 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0985193Z E1204 12:41:59.717000 345504 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0985342Z E1204 12:41:59.717000 345504 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0985624Z E1204 12:41:59.717000 345504 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0985739Z E1204 12:41:59.717000 345504 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0986010Z E1204 12:41:59.717000 345504 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0986163Z E1204 12:41:59.717000 345504 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0986433Z E1204 12:41:59.717000 345504 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0986582Z E1204 12:41:59.717000 345504 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0986852Z E1204 12:41:59.717000 345504 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0986982Z E1204 12:41:59.717000 345504 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0987251Z E1204 12:41:59.717000 345504 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0987393Z E1204 12:41:59.717000 345504 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0987923Z E1204 12:41:59.717000 345504 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 27136 on device 1. CUDA driver allocated memory was 1268776960 and is now 3315597312. 2025-12-04T12:44:29.0988034Z E1204 12:41:59.717000 345504 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0988224Z E1204 12:41:59.717000 345504 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0988669Z E1204 12:41:59.717000 345504 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0988778Z E1204 12:41:59.717000 345504 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0988980Z E1204 12:41:59.717000 345504 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0989139Z E1204 12:41:59.717000 345504 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:44:29.0989268Z E1204 12:41:59.721000 345506 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0989420Z E1204 12:41:59.721000 345506 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0989751Z E1204 12:41:59.721000 345506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0989897Z E1204 12:41:59.721000 345506 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0990174Z E1204 12:41:59.721000 345506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0990304Z E1204 12:41:59.721000 345506 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0990572Z E1204 12:41:59.721000 345506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0990725Z E1204 12:41:59.721000 345506 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0990995Z E1204 12:41:59.721000 345506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0991133Z E1204 12:41:59.721000 345506 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0991401Z E1204 12:41:59.721000 345506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0991530Z E1204 12:41:59.721000 345506 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T12:44:29.0991800Z E1204 12:41:59.721000 345506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0991941Z E1204 12:41:59.721000 345506 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0992466Z E1204 12:41:59.721000 345506 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 3. CUDA driver allocated memory was 1268776960 and is now 3315597312. 2025-12-04T12:44:29.0992575Z E1204 12:41:59.721000 345506 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0992768Z E1204 12:41:59.721000 345506 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0993213Z E1204 12:41:59.721000 345506 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0993320Z E1204 12:41:59.721000 345506 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0993523Z E1204 12:41:59.721000 345506 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0993682Z E1204 12:41:59.721000 345506 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:44:29.0993820Z E1204 12:41:59.739000 345503 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0993974Z E1204 12:41:59.739000 345503 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0994252Z E1204 12:41:59.739000 345503 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0994399Z E1204 12:41:59.739000 345503 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0994676Z E1204 12:41:59.739000 345503 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0994800Z E1204 12:41:59.739000 345503 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0995073Z E1204 12:41:59.739000 345503 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0995222Z E1204 12:41:59.739000 345503 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0995490Z E1204 12:41:59.739000 345503 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0995628Z E1204 12:41:59.739000 345503 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.0995897Z E1204 12:41:59.739000 345503 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.0996025Z E1204 12:41:59.739000 345503 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.0996295Z E1204 12:41:59.739000 345503 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.0996434Z E1204 12:41:59.739000 345503 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.0996954Z E1204 12:41:59.739000 345503 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 0. CUDA driver allocated memory was 1438646272 and is now 3707764736. 2025-12-04T12:44:29.0997063Z E1204 12:41:59.739000 345503 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0997265Z E1204 12:41:59.739000 345503 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.0997698Z E1204 12:41:59.739000 345503 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda 2025-12-04T12:44:29.0997807Z E1204 12:41:59.739000 345503 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.0998007Z E1204 12:41:59.739000 345503 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.0998173Z E1204 12:41:59.739000 345503 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:44:29.0998303Z E1204 12:41:59.748000 345505 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.0998455Z E1204 12:41:59.748000 345505 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.0998731Z E1204 12:41:59.748000 345505 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.0998888Z E1204 12:41:59.748000 345505 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.0999163Z E1204 12:41:59.748000 345505 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.0999289Z E1204 12:41:59.748000 345505 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.0999558Z E1204 12:41:59.748000 345505 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.0999741Z E1204 12:41:59.748000 345505 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1000011Z E1204 12:41:59.748000 345505 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1000150Z E1204 12:41:59.748000 345505 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1000421Z E1204 12:41:59.748000 345505 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1000548Z E1204 12:41:59.748000 345505 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1000818Z E1204 12:41:59.748000 345505 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1000956Z E1204 12:41:59.748000 345505 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1001481Z E1204 12:41:59.748000 345505 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 2. CUDA driver allocated memory was 958398464 and is now 3315597312. 
2025-12-04T12:44:29.1001601Z E1204 12:41:59.748000 345505 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1001788Z E1204 12:41:59.748000 345505 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1002217Z E1204 12:41:59.748000 345505 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda 2025-12-04T12:44:29.1002324Z E1204 12:41:59.748000 345505 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1002540Z E1204 12:41:59.748000 345505 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1002698Z E1204 12:41:59.748000 345505 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:44:29.1002739Z FAILED [11.4211s] [ 25%] 2025-12-04T12:44:29.1002741Z 2025-12-04T12:44:29.1002799Z =================================== FAILURES =================================== 2025-12-04T12:44:29.1002956Z _ TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda _ 2025-12-04T12:44:29.1003003Z Traceback (most recent call last): 2025-12-04T12:44:29.1003179Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:44:29.1003223Z self._join_processes(fn) 2025-12-04T12:44:29.1003396Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:44:29.1003463Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:44:29.1003643Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:44:29.1003687Z raise RuntimeError(error) 2025-12-04T12:44:29.1003766Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:44:29.1003812Z Traceback (most recent call last): 2025-12-04T12:44:29.1003972Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1004018Z getattr(self, test_name)() 2025-12-04T12:44:29.1004176Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1004210Z fn() 2025-12-04T12:44:29.1004363Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1004406Z method(*args, **kwargs) 2025-12-04T12:44:29.1004558Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1004599Z method(*args, **kwargs) 2025-12-04T12:44:29.1004749Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1004787Z with policy(): 2025-12-04T12:44:29.1004940Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T12:44:29.1004981Z raise RuntimeError(msg) 2025-12-04T12:44:29.1005399Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 1. CUDA driver allocated memory was 1268776960 and is now 3315597312. 2025-12-04T12:44:29.1005402Z 2025-12-04T12:44:29.1005485Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1005800Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda 2025-12-04T12:44:29.1005802Z 2025-12-04T12:44:29.1005889Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1005891Z 2025-12-04T12:44:29.1005950Z Process 3 exited with error code 10 and exception: 2025-12-04T12:44:29.1005996Z Traceback (most recent call last): 2025-12-04T12:44:29.1006160Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1006200Z getattr(self, test_name)() 2025-12-04T12:44:29.1006369Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1006405Z fn() 2025-12-04T12:44:29.1006556Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1006596Z method(*args, **kwargs) 2025-12-04T12:44:29.1006746Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1006964Z method(*args, **kwargs) 2025-12-04T12:44:29.1007116Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1007164Z with policy(): 2025-12-04T12:44:29.1007315Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1007355Z raise RuntimeError(msg) 2025-12-04T12:44:29.1007765Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 3. CUDA driver allocated memory was 1268776960 and is now 3315597312. 2025-12-04T12:44:29.1007778Z 2025-12-04T12:44:29.1007853Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1008161Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda 2025-12-04T12:44:29.1008165Z 2025-12-04T12:44:29.1008250Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1008252Z 2025-12-04T12:44:29.1008254Z 2025-12-04T12:44:29.1008330Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:44:29.1008417Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:44:29.1008691Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-66bc2c5f652db8bb.xml - 2025-12-04T12:44:29.1008751Z =========================== short test summary info ============================ 2025-12-04T12:44:29.1009075Z FAILED [11.4211s] distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:44:29.1009121Z Traceback (most recent call last): 2025-12-04T12:44:29.1009286Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1009328Z getattr(self, test_name)() 2025-12-04T12:44:29.1009489Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1009523Z fn() 2025-12-04T12:44:29.1009719Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1009760Z method(*args, **kwargs) 2025-12-04T12:44:29.1009909Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1009949Z method(*args, **kwargs) 2025-12-04T12:44:29.1010096Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1010134Z with policy(): 2025-12-04T12:44:29.1010285Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1010326Z raise RuntimeError(msg) 2025-12-04T12:44:29.1010746Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 1. CUDA driver allocated memory was 1268776960 and is now 3315597312. 
2025-12-04T12:44:29.1010750Z 2025-12-04T12:44:29.1010824Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1011131Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda 2025-12-04T12:44:29.1011146Z 2025-12-04T12:44:29.1011232Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1011234Z 2025-12-04T12:44:29.1011292Z Process 3 exited with error code 10 and exception: 2025-12-04T12:44:29.1011336Z Traceback (most recent call last): 2025-12-04T12:44:29.1011500Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1011565Z getattr(self, test_name)() 2025-12-04T12:44:29.1011725Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1011759Z fn() 2025-12-04T12:44:29.1011910Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1011949Z method(*args, **kwargs) 2025-12-04T12:44:29.1012098Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1012137Z method(*args, **kwargs) 2025-12-04T12:44:29.1012289Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1012325Z with policy(): 2025-12-04T12:44:29.1012479Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1012518Z raise RuntimeError(msg) 2025-12-04T12:44:29.1012922Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 3. CUDA driver allocated memory was 1268776960 and is now 3315597312. 2025-12-04T12:44:29.1012925Z 2025-12-04T12:44:29.1012996Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1013306Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda 2025-12-04T12:44:29.1013308Z 2025-12-04T12:44:29.1013396Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1013471Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:44:29.1013536Z ======================= 1 failed, 4 deselected in 11.43s ======================= 2025-12-04T12:44:29.1013572Z Got exit code 1 2025-12-04T12:44:29.1013612Z Retrying single test... 
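"Retrying single test..." means the runner now reruns only the failing node id in a fresh pytest session before classifying it as flaky or consistently failing. A rough sketch of that step follows (hypothetical helper name; the real logic, including the stepcurrent bookkeeping and the test-report paths seen below, lives in PyTorch's test/run_test.py).

import subprocess
import sys

def rerun_single_test(node_id, env, retries=2):
    # Hypothetical simplification of the CI runner's retry step: rerun one
    # failing pytest node id in isolation and report whether it recovers.
    for _ in range(retries):
        proc = subprocess.run(
            [sys.executable, "-m", "pytest", "-v", node_id],
            env=env,
        )
        if proc.returncode == 0:
            return "flaky"  # passed on rerun
    return "failed consistently"  # still failing after every rerun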
2025-12-04T12:44:29.1013839Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-641b00e6e5af3cdf.xml 2025-12-04T12:44:29.1013897Z ============================= test session starts ============================== 2025-12-04T12:44:29.1014012Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:44:29.1014053Z cachedir: .pytest_cache 2025-12-04T12:44:29.1014216Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:44:29.1014275Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:44:29.1014315Z configfile: pytest.ini 2025-12-04T12:44:29.1014485Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:44:29.1014557Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T12:44:29.1014861Z stepcurrent: skipping 4 already run items. Running only test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda 2025-12-04T12:44:29.1014913Z Running 1 items in this shard 2025-12-04T12:44:29.1014915Z 2025-12-04T12:44:29.1015299Z distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda I1204 12:42:03.803000 345972 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 346041 2025-12-04T12:44:29.1015466Z I1204 12:42:03.804000 345972 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 346042 2025-12-04T12:44:29.1015617Z I1204 12:42:03.804000 345972 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 346043 2025-12-04T12:44:29.1015770Z I1204 12:42:03.805000 345972 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 346044 2025-12-04T12:44:29.1016451Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:118: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1016498Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1017167Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:118: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1017210Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1017876Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:118: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. 
Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1017930Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1018591Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:118: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1018633Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1019142Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1019194Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1019723Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1019787Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1020273Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1020330Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1020818Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1020866Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1021538Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:130: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:44:29.1021580Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1022248Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:130: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1022290Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1022965Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:130: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1023008Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1023670Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:130: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1023711Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1024219Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1024277Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:44:29.1024758Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1024828Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:44:29.1025309Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1025375Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:44:29.1025853Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1025909Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:44:29.1026148Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1026193Z local_shape = tensor.shape 2025-12-04T12:44:29.1026428Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1026464Z tensor.shape, 2025-12-04T12:44:29.1026697Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1026735Z tensor.dtype, 2025-12-04T12:44:29.1026968Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1027010Z local_shape = tensor.shape 2025-12-04T12:44:29.1027251Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1027288Z tensor.shape, 2025-12-04T12:44:29.1027519Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1027555Z tensor.dtype, 2025-12-04T12:44:29.1027784Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1027826Z local_shape = tensor.shape 2025-12-04T12:44:29.1028057Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1028092Z tensor.shape, 2025-12-04T12:44:29.1028342Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1028379Z tensor.dtype, 2025-12-04T12:44:29.1028608Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1028649Z local_shape = tensor.shape 2025-12-04T12:44:29.1028878Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1028936Z tensor.shape, 2025-12-04T12:44:29.1029167Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:44:29.1029203Z tensor.dtype, 2025-12-04T12:44:29.1029349Z E1204 12:42:13.571000 346041 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1029506Z E1204 12:42:13.571000 346041 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1029833Z E1204 12:42:13.571000 346041 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1029982Z E1204 12:42:13.571000 346041 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1030262Z E1204 12:42:13.571000 346041 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1030379Z E1204 12:42:13.571000 346041 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1030650Z E1204 12:42:13.571000 346041 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1030790Z E1204 12:42:13.571000 346041 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1031062Z E1204 12:42:13.571000 346041 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1031201Z E1204 12:42:13.571000 346041 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1031471Z E1204 12:42:13.571000 346041 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1031614Z E1204 12:42:13.571000 346041 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1031887Z E1204 12:42:13.571000 346041 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1032028Z E1204 12:42:13.571000 346041 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1032567Z E1204 12:42:13.571000 346041 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 0. CUDA driver allocated memory was 1438646272 and is now 3946840064. 
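The RuntimeError above is raised by the test harness when per-device memory after the test body stays above the value recorded before it. A rough, illustrative sketch of the two quantities the message reports (caching-allocator bytes and driver-allocated bytes); this is only an approximation for reading the numbers, not PyTorch's actual leak checker:

import torch

def memory_snapshot(device: int) -> dict:
    free, total = torch.cuda.mem_get_info(device)
    return {
        "caching_allocator": torch.cuda.memory_allocated(device),  # bytes currently held by live tensors
        "driver_allocated": total - free,                          # bytes this process has taken from the driver
    }

before = memory_snapshot(0)
x = torch.ones(1024, 1024, device="cuda:0")   # stand-in for what the test allocates
del x
torch.cuda.empty_cache()                      # release cached blocks back to the driver
after = memory_snapshot(0)
# A leak shows up as `after` remaining above `before` even after cleanup.
print(before, after)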
2025-12-04T12:44:29.1032679Z E1204 12:42:13.571000 346041 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1032869Z E1204 12:42:13.571000 346041 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1033300Z E1204 12:42:13.571000 346041 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda 2025-12-04T12:44:29.1033423Z E1204 12:42:13.571000 346041 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1033627Z E1204 12:42:13.571000 346041 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1033798Z E1204 12:42:13.571000 346041 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:44:29.1033927Z E1204 12:42:13.573000 346042 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1034079Z E1204 12:42:13.573000 346042 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1034360Z E1204 12:42:13.573000 346042 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1034507Z E1204 12:42:13.573000 346042 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1034784Z E1204 12:42:13.573000 346042 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1034901Z E1204 12:42:13.573000 346042 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1035174Z E1204 12:42:13.573000 346042 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1035313Z E1204 12:42:13.573000 346042 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1035583Z E1204 12:42:13.573000 346042 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1035722Z E1204 12:42:13.573000 346042 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1036006Z E1204 12:42:13.573000 346042 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1036133Z E1204 12:42:13.573000 346042 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1036402Z E1204 12:42:13.573000 346042 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1036542Z E1204 12:42:13.573000 346042 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1037078Z E1204 12:42:13.573000 346042 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 1. CUDA driver allocated memory was 1268776960 and is now 3584032768. 2025-12-04T12:44:29.1037187Z E1204 12:42:13.573000 346042 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1037376Z E1204 12:42:13.573000 346042 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1037807Z E1204 12:42:13.573000 346042 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda 2025-12-04T12:44:29.1037924Z E1204 12:42:13.573000 346042 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1038139Z E1204 12:42:13.573000 346042 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1038296Z E1204 12:42:13.573000 346042 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:44:29.1038424Z E1204 12:42:14.000000 346044 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1038578Z E1204 12:42:14.000000 346044 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1038857Z E1204 12:42:14.000000 346044 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1039003Z E1204 12:42:14.000000 346044 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1039283Z E1204 12:42:14.000000 346044 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1039398Z E1204 12:42:14.000000 346044 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1039709Z E1204 12:42:14.000000 346044 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1039850Z E1204 12:42:14.000000 346044 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1040119Z E1204 12:42:14.000000 346044 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1040271Z E1204 12:42:14.000000 346044 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1040539Z E1204 12:42:14.000000 346044 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1040664Z E1204 12:42:14.000000 346044 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1040934Z E1204 12:42:14.000000 346044 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1041073Z E1204 12:42:14.000000 346044 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1041608Z E1204 12:42:14.000000 346044 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 3. CUDA driver allocated memory was 1268776960 and is now 3315597312. 2025-12-04T12:44:29.1041715Z E1204 12:42:14.000000 346044 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1041902Z E1204 12:42:14.000000 346044 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1042347Z E1204 12:42:14.000000 346044 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda 2025-12-04T12:44:29.1042465Z E1204 12:42:14.000000 346044 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1042670Z E1204 12:42:14.000000 346044 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1042826Z E1204 12:42:14.000000 346044 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:44:29.1042955Z E1204 12:42:14.065000 346043 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1043108Z E1204 12:42:14.065000 346043 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1043387Z E1204 12:42:14.065000 346043 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1043537Z E1204 12:42:14.065000 346043 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1043813Z E1204 12:42:14.065000 346043 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1043927Z E1204 12:42:14.065000 346043 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1044195Z E1204 12:42:14.065000 346043 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1044335Z E1204 12:42:14.065000 346043 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1044614Z E1204 12:42:14.065000 346043 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1044755Z E1204 12:42:14.065000 346043 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1045022Z E1204 12:42:14.065000 346043 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1045149Z E1204 12:42:14.065000 346043 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1045435Z E1204 12:42:14.065000 346043 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1045576Z E1204 12:42:14.065000 346043 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1046100Z E1204 12:42:14.065000 346043 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 2. CUDA driver allocated memory was 1262485504 and is now 3315597312. 
2025-12-04T12:44:29.1046219Z E1204 12:42:14.065000 346043 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1046405Z E1204 12:42:14.065000 346043 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1046833Z E1204 12:42:14.065000 346043 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda 2025-12-04T12:44:29.1046949Z E1204 12:42:14.065000 346043 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1047153Z E1204 12:42:14.065000 346043 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1047308Z E1204 12:42:14.065000 346043 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:44:29.1047350Z FAILED [11.1205s] [100%] 2025-12-04T12:44:29.1047352Z 2025-12-04T12:44:29.1047408Z =================================== FAILURES =================================== 2025-12-04T12:44:29.1047565Z _ TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda _ 2025-12-04T12:44:29.1047613Z Traceback (most recent call last): 2025-12-04T12:44:29.1047778Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:44:29.1047821Z self._join_processes(fn) 2025-12-04T12:44:29.1047993Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:44:29.1048049Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:44:29.1048230Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:44:29.1048274Z raise RuntimeError(error) 2025-12-04T12:44:29.1048353Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:44:29.1048398Z Traceback (most recent call last): 2025-12-04T12:44:29.1048559Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1048613Z getattr(self, test_name)() 2025-12-04T12:44:29.1048774Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1048811Z fn() 2025-12-04T12:44:29.1048963Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1049004Z method(*args, **kwargs) 2025-12-04T12:44:29.1049156Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1049197Z method(*args, **kwargs) 2025-12-04T12:44:29.1049350Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1049390Z with policy(): 2025-12-04T12:44:29.1049552Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T12:44:29.1049636Z raise RuntimeError(msg) 2025-12-04T12:44:29.1050042Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 0. CUDA driver allocated memory was 1438646272 and is now 3946840064. 2025-12-04T12:44:29.1050046Z 2025-12-04T12:44:29.1050121Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1050448Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda 2025-12-04T12:44:29.1050451Z 2025-12-04T12:44:29.1050539Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1050553Z 2025-12-04T12:44:29.1050554Z 2025-12-04T12:44:29.1050631Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:44:29.1050717Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:44:29.1050990Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-641b00e6e5af3cdf.xml - 2025-12-04T12:44:29.1051050Z =========================== short test summary info ============================ 2025-12-04T12:44:29.1051374Z FAILED [11.1205s] distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:44:29.1051422Z Traceback (most recent call last): 2025-12-04T12:44:29.1051588Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1051634Z getattr(self, test_name)() 2025-12-04T12:44:29.1051793Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1051831Z fn() 2025-12-04T12:44:29.1051983Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1052025Z method(*args, **kwargs) 2025-12-04T12:44:29.1052175Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1052218Z method(*args, **kwargs) 2025-12-04T12:44:29.1052370Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1052409Z with policy(): 2025-12-04T12:44:29.1052562Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1052617Z raise RuntimeError(msg) 2025-12-04T12:44:29.1053018Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 0. CUDA driver allocated memory was 1438646272 and is now 3946840064. 
2025-12-04T12:44:29.1053020Z 2025-12-04T12:44:29.1053096Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1053407Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda 2025-12-04T12:44:29.1053409Z 2025-12-04T12:44:29.1053507Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1053572Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:44:29.1053634Z ======================= 1 failed, 7 deselected in 11.13s ======================= 2025-12-04T12:44:29.1053671Z Got exit code 1 2025-12-04T12:44:29.1053710Z Retrying single test... 2025-12-04T12:44:29.1053937Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-caddf5c446c0670f.xml 2025-12-04T12:44:29.1053994Z ============================= test session starts ============================== 2025-12-04T12:44:29.1054119Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:44:29.1054160Z cachedir: .pytest_cache 2025-12-04T12:44:29.1054319Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:44:29.1054375Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:44:29.1054415Z configfile: pytest.ini 2025-12-04T12:44:29.1054579Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:44:29.1054650Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T12:44:29.1054949Z stepcurrent: skipping 4 already run items. Running only test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda 2025-12-04T12:44:29.1054993Z Running 1 items in this shard 2025-12-04T12:44:29.1054996Z 2025-12-04T12:44:29.1055378Z distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda I1204 12:42:17.697000 346510 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 346579 2025-12-04T12:44:29.1055535Z I1204 12:42:17.697000 346510 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 346580 2025-12-04T12:44:29.1055690Z I1204 12:42:17.698000 346510 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 346581 2025-12-04T12:44:29.1055840Z I1204 12:42:17.699000 346510 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 346582 2025-12-04T12:44:29.1056531Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:118: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:44:29.1056575Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1057250Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:118: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1057294Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1057966Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:118: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1058012Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1058673Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:118: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1058726Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1059220Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1059279Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1059797Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1059845Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1060330Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1060377Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1060859Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1060906Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1061596Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:130: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1061639Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1062297Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:130: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1062338Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1062833Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1062893Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:44:29.1063371Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1063442Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:44:29.1063677Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1063735Z local_shape = tensor.shape 2025-12-04T12:44:29.1063970Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1064009Z tensor.shape, 2025-12-04T12:44:29.1064241Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1064280Z tensor.dtype, 2025-12-04T12:44:29.1064509Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:44:29.1064553Z local_shape = tensor.shape 2025-12-04T12:44:29.1064785Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1064824Z tensor.shape, 2025-12-04T12:44:29.1065056Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1065091Z tensor.dtype, 2025-12-04T12:44:29.1065760Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:130: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1065803Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1066475Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:130: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1066517Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1067000Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1067058Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:44:29.1067548Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1067605Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:44:29.1067838Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1067890Z local_shape = tensor.shape 2025-12-04T12:44:29.1068120Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1068157Z tensor.shape, 2025-12-04T12:44:29.1068389Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:44:29.1068434Z tensor.dtype, 2025-12-04T12:44:29.1068666Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1068708Z local_shape = tensor.shape 2025-12-04T12:44:29.1068940Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1068978Z tensor.shape, 2025-12-04T12:44:29.1069209Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1069244Z tensor.dtype, 2025-12-04T12:44:29.1069381Z E1204 12:42:27.496000 346580 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1069536Z E1204 12:42:27.496000 346580 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1069873Z E1204 12:42:27.496000 346580 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1070020Z E1204 12:42:27.496000 346580 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1070299Z E1204 12:42:27.496000 346580 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1070417Z E1204 12:42:27.496000 346580 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1070701Z E1204 12:42:27.496000 346580 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1070841Z E1204 12:42:27.496000 346580 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1071107Z E1204 12:42:27.496000 346580 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1071248Z E1204 12:42:27.496000 346580 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1071527Z E1204 12:42:27.496000 346580 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1071660Z E1204 12:42:27.496000 346580 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1071929Z E1204 12:42:27.496000 346580 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1072068Z E1204 12:42:27.496000 346580 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1072610Z E1204 12:42:27.496000 346580 
site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 1. CUDA driver allocated memory was 1268776960 and is now 3315597312. 2025-12-04T12:44:29.1072732Z E1204 12:42:27.496000 346580 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1072921Z E1204 12:42:27.496000 346580 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1073348Z E1204 12:42:27.496000 346580 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda 2025-12-04T12:44:29.1073459Z E1204 12:42:27.496000 346580 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1073666Z E1204 12:42:27.496000 346580 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1073825Z E1204 12:42:27.496000 346580 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:44:29.1073955Z E1204 12:42:27.498000 346581 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1074105Z E1204 12:42:27.498000 346581 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1074386Z E1204 12:42:27.498000 346581 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1074532Z E1204 12:42:27.498000 346581 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1074809Z E1204 12:42:27.498000 346581 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1074933Z E1204 12:42:27.498000 346581 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1075205Z E1204 12:42:27.498000 346581 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1075347Z E1204 12:42:27.498000 346581 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1075616Z E1204 12:42:27.498000 346581 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1075762Z E1204 12:42:27.498000 346581 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1076043Z E1204 12:42:27.498000 346581 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1076170Z E1204 12:42:27.498000 346581 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1076440Z E1204 12:42:27.498000 346581 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1076591Z E1204 12:42:27.498000 346581 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1077118Z E1204 12:42:27.498000 346581 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 2. CUDA driver allocated memory was 1268776960 and is now 3315597312. 2025-12-04T12:44:29.1077241Z E1204 12:42:27.498000 346581 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1077429Z E1204 12:42:27.498000 346581 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1077855Z E1204 12:42:27.498000 346581 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda 2025-12-04T12:44:29.1077962Z E1204 12:42:27.498000 346581 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1078168Z E1204 12:42:27.498000 346581 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1078327Z E1204 12:42:27.498000 346581 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:44:29.1078457Z E1204 12:42:27.553000 346582 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1078608Z E1204 12:42:27.553000 346582 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1078888Z E1204 12:42:27.553000 346582 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1079034Z E1204 12:42:27.553000 346582 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1079325Z E1204 12:42:27.553000 346582 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1079438Z E1204 12:42:27.553000 346582 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1079746Z E1204 12:42:27.553000 346582 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 
3329, in wrapper 2025-12-04T12:44:29.1079885Z E1204 12:42:27.553000 346582 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1080165Z E1204 12:42:27.553000 346582 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1080309Z E1204 12:42:27.553000 346582 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1080574Z E1204 12:42:27.553000 346582 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1080703Z E1204 12:42:27.553000 346582 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1080974Z E1204 12:42:27.553000 346582 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1081130Z E1204 12:42:27.553000 346582 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1081655Z E1204 12:42:27.553000 346582 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 3. CUDA driver allocated memory was 952107008 and is now 3315597312. 
2025-12-04T12:44:29.1081773Z E1204 12:42:27.553000 346582 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1081962Z E1204 12:42:27.553000 346582 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1082397Z E1204 12:42:27.553000 346582 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda 2025-12-04T12:44:29.1082509Z E1204 12:42:27.553000 346582 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1082709Z E1204 12:42:27.553000 346582 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1082867Z E1204 12:42:27.553000 346582 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:44:29.1082997Z E1204 12:42:27.564000 346579 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1083150Z E1204 12:42:27.564000 346579 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1083432Z E1204 12:42:27.564000 346579 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1083593Z E1204 12:42:27.564000 346579 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1083868Z E1204 12:42:27.564000 346579 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1083982Z E1204 12:42:27.564000 346579 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1084252Z E1204 12:42:27.564000 346579 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1084391Z E1204 12:42:27.564000 346579 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1084675Z E1204 12:42:27.564000 346579 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1084815Z E1204 12:42:27.564000 346579 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1085082Z E1204 12:42:27.564000 346579 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1085221Z E1204 12:42:27.564000 346579 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1085491Z E1204 12:42:27.564000 346579 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1085643Z E1204 12:42:27.564000 346579 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1086163Z E1204 12:42:27.564000 346579 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 0. CUDA driver allocated memory was 1438646272 and is now 3705667584. 2025-12-04T12:44:29.1086272Z E1204 12:42:27.564000 346579 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1086462Z E1204 12:42:27.564000 346579 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1086891Z E1204 12:42:27.564000 346579 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda 2025-12-04T12:44:29.1087001Z E1204 12:42:27.564000 346579 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1087200Z E1204 12:42:27.564000 346579 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1087358Z E1204 12:42:27.564000 346579 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:44:29.1087399Z FAILED [11.2209s] [100%] 2025-12-04T12:44:29.1087401Z 2025-12-04T12:44:29.1087458Z =================================== FAILURES =================================== 2025-12-04T12:44:29.1087613Z _ TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda _ 2025-12-04T12:44:29.1087662Z Traceback (most recent call last): 2025-12-04T12:44:29.1087834Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:44:29.1087877Z self._join_processes(fn) 2025-12-04T12:44:29.1088049Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:44:29.1088102Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:44:29.1088282Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:44:29.1088326Z raise RuntimeError(error) 2025-12-04T12:44:29.1088406Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:44:29.1088450Z Traceback (most recent call last): 2025-12-04T12:44:29.1088623Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1088667Z getattr(self, test_name)() 2025-12-04T12:44:29.1088826Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1088860Z fn() 2025-12-04T12:44:29.1089011Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1089051Z method(*args, **kwargs) 2025-12-04T12:44:29.1089202Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1089253Z method(*args, **kwargs) 2025-12-04T12:44:29.1089403Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1089439Z with policy(): 2025-12-04T12:44:29.1089622Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1089679Z raise RuntimeError(msg) 2025-12-04T12:44:29.1090088Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 1. CUDA driver allocated memory was 1268776960 and is now 3315597312. 2025-12-04T12:44:29.1090090Z 2025-12-04T12:44:29.1090164Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1090473Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda 2025-12-04T12:44:29.1090476Z 2025-12-04T12:44:29.1090563Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1090567Z 2025-12-04T12:44:29.1090626Z Process 2 exited with error code 10 and exception: 2025-12-04T12:44:29.1090673Z Traceback (most recent call last): 2025-12-04T12:44:29.1090836Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1090878Z getattr(self, test_name)() 2025-12-04T12:44:29.1091042Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1091078Z fn() 2025-12-04T12:44:29.1091230Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1091270Z method(*args, **kwargs) 2025-12-04T12:44:29.1091419Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1091459Z method(*args, **kwargs) 2025-12-04T12:44:29.1091628Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1091667Z with policy(): 2025-12-04T12:44:29.1091817Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1091858Z raise RuntimeError(msg) 2025-12-04T12:44:29.1092265Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 2. CUDA driver allocated memory was 1268776960 and is now 3315597312. 
2025-12-04T12:44:29.1092268Z 2025-12-04T12:44:29.1092341Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1092665Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda 2025-12-04T12:44:29.1092669Z 2025-12-04T12:44:29.1092756Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1092758Z 2025-12-04T12:44:29.1092815Z Process 3 exited with error code 10 and exception: 2025-12-04T12:44:29.1092859Z Traceback (most recent call last): 2025-12-04T12:44:29.1093023Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1093089Z getattr(self, test_name)() 2025-12-04T12:44:29.1093249Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1093283Z fn() 2025-12-04T12:44:29.1093436Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1093489Z method(*args, **kwargs) 2025-12-04T12:44:29.1093640Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1093679Z method(*args, **kwargs) 2025-12-04T12:44:29.1093828Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1093865Z with policy(): 2025-12-04T12:44:29.1094016Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1094060Z raise RuntimeError(msg) 2025-12-04T12:44:29.1094466Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 3. CUDA driver allocated memory was 952107008 and is now 3315597312. 2025-12-04T12:44:29.1094470Z 2025-12-04T12:44:29.1094545Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1094858Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda 2025-12-04T12:44:29.1094860Z 2025-12-04T12:44:29.1094949Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1094951Z 2025-12-04T12:44:29.1094953Z 2025-12-04T12:44:29.1095030Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:44:29.1095117Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:44:29.1095393Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-caddf5c446c0670f.xml - 2025-12-04T12:44:29.1095454Z =========================== short test summary info ============================ 2025-12-04T12:44:29.1095789Z FAILED [11.2209s] distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:44:29.1095836Z Traceback (most recent call last): 2025-12-04T12:44:29.1096001Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1096043Z getattr(self, test_name)() 2025-12-04T12:44:29.1096204Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1096238Z fn() 2025-12-04T12:44:29.1096388Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1096441Z method(*args, **kwargs) 2025-12-04T12:44:29.1096596Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1096635Z method(*args, **kwargs) 2025-12-04T12:44:29.1096789Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1096827Z with policy(): 2025-12-04T12:44:29.1096978Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1097031Z raise RuntimeError(msg) 2025-12-04T12:44:29.1097438Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 1. CUDA driver allocated memory was 1268776960 and is now 3315597312. 
2025-12-04T12:44:29.1097450Z 2025-12-04T12:44:29.1097523Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1097829Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda 2025-12-04T12:44:29.1097831Z 2025-12-04T12:44:29.1097918Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1097921Z 2025-12-04T12:44:29.1097978Z Process 2 exited with error code 10 and exception: 2025-12-04T12:44:29.1098025Z Traceback (most recent call last): 2025-12-04T12:44:29.1098187Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1098228Z getattr(self, test_name)() 2025-12-04T12:44:29.1098390Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1098426Z fn() 2025-12-04T12:44:29.1098577Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1098615Z method(*args, **kwargs) 2025-12-04T12:44:29.1098767Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1098806Z method(*args, **kwargs) 2025-12-04T12:44:29.1098954Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1098990Z with policy(): 2025-12-04T12:44:29.1099143Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1099184Z raise RuntimeError(msg) 2025-12-04T12:44:29.1099645Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 2. CUDA driver allocated memory was 1268776960 and is now 3315597312. 
2025-12-04T12:44:29.1099649Z 2025-12-04T12:44:29.1099724Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1100032Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda 2025-12-04T12:44:29.1100036Z 2025-12-04T12:44:29.1100123Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1100125Z 2025-12-04T12:44:29.1100182Z Process 3 exited with error code 10 and exception: 2025-12-04T12:44:29.1100228Z Traceback (most recent call last): 2025-12-04T12:44:29.1100401Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1100445Z getattr(self, test_name)() 2025-12-04T12:44:29.1100604Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1100641Z fn() 2025-12-04T12:44:29.1100789Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1100830Z method(*args, **kwargs) 2025-12-04T12:44:29.1100978Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1101035Z method(*args, **kwargs) 2025-12-04T12:44:29.1101185Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1101221Z with policy(): 2025-12-04T12:44:29.1101375Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1101429Z raise RuntimeError(msg) 2025-12-04T12:44:29.1101832Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 3. CUDA driver allocated memory was 952107008 and is now 3315597312. 2025-12-04T12:44:29.1101835Z 2025-12-04T12:44:29.1101908Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1102217Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda 2025-12-04T12:44:29.1102219Z 2025-12-04T12:44:29.1102304Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1102370Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T12:44:29.1102432Z ======================= 1 failed, 7 deselected in 11.23s ======================= 2025-12-04T12:44:29.1102471Z Got exit code 1 2025-12-04T12:44:29.1102732Z FAILED CONSISTENTLY: test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda 2025-12-04T12:44:29.1102864Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:44:29.1103094Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-d1832f138396539a.xml 2025-12-04T12:44:29.1103154Z ============================= test session starts ============================== 2025-12-04T12:44:29.1103267Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:44:29.1103310Z cachedir: .pytest_cache 2025-12-04T12:44:29.1103479Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:44:29.1103528Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:44:29.1103569Z configfile: pytest.ini 2025-12-04T12:44:29.1103736Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:44:29.1103806Z collecting ... collected 8 items / 5 deselected / 3 selected 2025-12-04T12:44:29.1103860Z stepcurrent: skipping 5 already run items. 2025-12-04T12:44:29.1103905Z Running 3 items in this shard 2025-12-04T12:44:29.1103907Z 2025-12-04T12:44:29.1104296Z distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda I1204 12:42:31.257000 347048 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 347117 2025-12-04T12:44:29.1104455Z I1204 12:42:31.257000 347048 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 347118 2025-12-04T12:44:29.1104606Z I1204 12:42:31.258000 347048 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 347119 2025-12-04T12:44:29.1104756Z I1204 12:42:31.259000 347048 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 347120 2025-12-04T12:44:29.1105440Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:118: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1105507Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1106176Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:118: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1106220Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1106885Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:118: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1106927Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1107591Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:118: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1107635Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1108146Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1108199Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1108681Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1108732Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1109239Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1109286Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1109897Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T12:44:29.1109961Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1110633Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:130: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1110689Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1111352Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:130: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1111397Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1111882Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1111942Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:44:29.1112424Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1112482Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:44:29.1113165Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:130: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1113207Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1113867Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:130: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:44:29.1113923Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1114409Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1114468Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:44:29.1114948Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1115016Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:44:29.1115263Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1115306Z local_shape = tensor.shape 2025-12-04T12:44:29.1115542Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1115579Z tensor.shape, 2025-12-04T12:44:29.1115813Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1115850Z tensor.dtype, 2025-12-04T12:44:29.1116082Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1116125Z local_shape = tensor.shape 2025-12-04T12:44:29.1116358Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1116393Z tensor.shape, 2025-12-04T12:44:29.1116623Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1116658Z tensor.dtype, 2025-12-04T12:44:29.1116892Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1116934Z local_shape = tensor.shape 2025-12-04T12:44:29.1117165Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1117203Z tensor.shape, 2025-12-04T12:44:29.1117441Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:44:29.1117479Z tensor.dtype, 2025-12-04T12:44:29.1117709Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1117751Z local_shape = tensor.shape 2025-12-04T12:44:29.1117984Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1118022Z tensor.shape, 2025-12-04T12:44:29.1118264Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1118302Z tensor.dtype, 2025-12-04T12:44:29.1118438Z E1204 12:42:40.524000 347117 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1118593Z E1204 12:42:40.524000 347117 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1118877Z E1204 12:42:40.524000 347117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1119034Z E1204 12:42:40.524000 347117 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1119314Z E1204 12:42:40.524000 347117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1119440Z E1204 12:42:40.524000 347117 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1119753Z E1204 12:42:40.524000 347117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1119893Z E1204 12:42:40.524000 347117 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1120163Z E1204 12:42:40.524000 347117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1120302Z E1204 12:42:40.524000 347117 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1120572Z E1204 12:42:40.524000 347117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1120701Z E1204 12:42:40.524000 347117 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1120973Z E1204 12:42:40.524000 347117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1121114Z E1204 12:42:40.524000 347117 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1121642Z E1204 12:42:40.524000 347117 
site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 0. CUDA driver allocated memory was 1438646272 and is now 3682598912. 2025-12-04T12:44:29.1121767Z E1204 12:42:40.524000 347117 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1121955Z E1204 12:42:40.524000 347117 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1122386Z E1204 12:42:40.524000 347117 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda 2025-12-04T12:44:29.1122494Z E1204 12:42:40.524000 347117 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1122710Z E1204 12:42:40.524000 347117 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1122871Z E1204 12:42:40.524000 347117 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:44:29.1123000Z E1204 12:42:40.600000 347119 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1123154Z E1204 12:42:40.600000 347119 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1123434Z E1204 12:42:40.600000 347119 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1123594Z E1204 12:42:40.600000 347119 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1123872Z E1204 12:42:40.600000 347119 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1124000Z E1204 12:42:40.600000 347119 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1124269Z E1204 12:42:40.600000 347119 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1124407Z E1204 12:42:40.600000 347119 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1124676Z E1204 12:42:40.600000 347119 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1124814Z E1204 12:42:40.600000 347119 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1125084Z E1204 12:42:40.600000 347119 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1125212Z E1204 12:42:40.600000 347119 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1125483Z E1204 12:42:40.600000 347119 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1125623Z E1204 12:42:40.600000 347119 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1126163Z E1204 12:42:40.600000 347119 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 2. CUDA driver allocated memory was 1262485504 and is now 3292528640. 2025-12-04T12:44:29.1126271Z E1204 12:42:40.600000 347119 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1126459Z E1204 12:42:40.600000 347119 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1126890Z E1204 12:42:40.600000 347119 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda 2025-12-04T12:44:29.1127005Z E1204 12:42:40.600000 347119 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1127209Z E1204 12:42:40.600000 347119 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1127368Z E1204 12:42:40.600000 347119 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:44:29.1127497Z E1204 12:42:40.616000 347120 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1127649Z E1204 12:42:40.616000 347120 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1127939Z E1204 12:42:40.616000 347120 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1128086Z E1204 12:42:40.616000 347120 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1128370Z E1204 12:42:40.616000 347120 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1128485Z E1204 12:42:40.616000 347120 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1128754Z E1204 12:42:40.616000 347120 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 
3329, in wrapper 2025-12-04T12:44:29.1128893Z E1204 12:42:40.616000 347120 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1129163Z E1204 12:42:40.616000 347120 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1129302Z E1204 12:42:40.616000 347120 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1129616Z E1204 12:42:40.616000 347120 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1129741Z E1204 12:42:40.616000 347120 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1130011Z E1204 12:42:40.616000 347120 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1130152Z E1204 12:42:40.616000 347120 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1130692Z E1204 12:42:40.616000 347120 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 3. CUDA driver allocated memory was 1268776960 and is now 3292528640. 
2025-12-04T12:44:29.1130801Z E1204 12:42:40.616000 347120 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1130990Z E1204 12:42:40.616000 347120 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1131433Z E1204 12:42:40.616000 347120 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda 2025-12-04T12:44:29.1131542Z E1204 12:42:40.616000 347120 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1131749Z E1204 12:42:40.616000 347120 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1131908Z E1204 12:42:40.616000 347120 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:44:29.1132047Z E1204 12:42:40.616000 347118 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1132199Z E1204 12:42:40.616000 347118 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1132480Z E1204 12:42:40.616000 347118 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1132640Z E1204 12:42:40.616000 347118 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1132916Z E1204 12:42:40.616000 347118 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1133030Z E1204 12:42:40.616000 347118 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1133299Z E1204 12:42:40.616000 347118 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1133440Z E1204 12:42:40.616000 347118 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1133711Z E1204 12:42:40.616000 347118 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1133848Z E1204 12:42:40.616000 347118 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1134116Z E1204 12:42:40.616000 347118 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1134242Z E1204 12:42:40.616000 347118 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1134517Z E1204 12:42:40.616000 347118 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1134667Z E1204 12:42:40.616000 347118 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1135186Z E1204 12:42:40.616000 347118 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 1. CUDA driver allocated memory was 1268776960 and is now 3292528640. 2025-12-04T12:44:29.1135294Z E1204 12:42:40.616000 347118 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1135483Z E1204 12:42:40.616000 347118 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1135916Z E1204 12:42:40.616000 347118 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda 2025-12-04T12:44:29.1136023Z E1204 12:42:40.616000 347118 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1136223Z E1204 12:42:40.616000 347118 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1136390Z E1204 12:42:40.616000 347118 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:44:29.1136430Z FAILED [10.6227s] [ 33%] 2025-12-04T12:44:29.1136433Z 2025-12-04T12:44:29.1136489Z =================================== FAILURES =================================== 2025-12-04T12:44:29.1136656Z _ TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda _ 2025-12-04T12:44:29.1136703Z Traceback (most recent call last): 2025-12-04T12:44:29.1136869Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:44:29.1136913Z self._join_processes(fn) 2025-12-04T12:44:29.1137085Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:44:29.1137141Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:44:29.1137323Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:44:29.1137369Z raise RuntimeError(error) 2025-12-04T12:44:29.1137447Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:44:29.1137495Z Traceback (most recent call last): 2025-12-04T12:44:29.1137660Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1137705Z getattr(self, test_name)() 2025-12-04T12:44:29.1137864Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1137900Z fn() 2025-12-04T12:44:29.1138051Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1138095Z method(*args, **kwargs) 2025-12-04T12:44:29.1138248Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1138293Z method(*args, **kwargs) 2025-12-04T12:44:29.1138444Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1138488Z with policy(): 2025-12-04T12:44:29.1138653Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1138698Z raise RuntimeError(msg) 2025-12-04T12:44:29.1139135Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 0. CUDA driver allocated memory was 1438646272 and is now 3682598912. 2025-12-04T12:44:29.1139138Z 2025-12-04T12:44:29.1139231Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1139564Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda 2025-12-04T12:44:29.1139566Z 2025-12-04T12:44:29.1139725Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1139727Z 2025-12-04T12:44:29.1139730Z 2025-12-04T12:44:29.1139829Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:44:29.1139970Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:44:29.1140276Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-d1832f138396539a.xml - 2025-12-04T12:44:29.1140346Z =========================== short test summary info ============================ 2025-12-04T12:44:29.1140708Z FAILED [10.6227s] distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:44:29.1140764Z Traceback (most recent call last): 2025-12-04T12:44:29.1140978Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1141036Z getattr(self, test_name)() 2025-12-04T12:44:29.1141218Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1141263Z fn() 2025-12-04T12:44:29.1141442Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1141487Z method(*args, **kwargs) 2025-12-04T12:44:29.1141677Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1141728Z method(*args, **kwargs) 2025-12-04T12:44:29.1141901Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1141950Z with policy(): 2025-12-04T12:44:29.1142129Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1142209Z raise RuntimeError(msg) 2025-12-04T12:44:29.1142629Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 0. CUDA driver allocated memory was 1438646272 and is now 3682598912. 2025-12-04T12:44:29.1142631Z 2025-12-04T12:44:29.1142730Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1143053Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda 2025-12-04T12:44:29.1143056Z 2025-12-04T12:44:29.1143161Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1143260Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:44:29.1143351Z ======================= 1 failed, 5 deselected in 10.63s ======================= 2025-12-04T12:44:29.1143401Z Got exit code 1 2025-12-04T12:44:29.1143470Z Retrying single test... 
2025-12-04T12:44:29.1143707Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-824ccfd58c5f3ae3.xml 2025-12-04T12:44:29.1143797Z ============================= test session starts ============================== 2025-12-04T12:44:29.1143939Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:44:29.1143992Z cachedir: .pytest_cache 2025-12-04T12:44:29.1144190Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:44:29.1144249Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:44:29.1144320Z configfile: pytest.ini 2025-12-04T12:44:29.1144500Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:44:29.1144605Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T12:44:29.1144918Z stepcurrent: skipping 5 already run items. Running only test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda 2025-12-04T12:44:29.1144998Z Running 1 items in this shard 2025-12-04T12:44:29.1145000Z 2025-12-04T12:44:29.1145389Z distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda I1204 12:42:44.624000 347586 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 347655 2025-12-04T12:44:29.1145606Z I1204 12:42:44.625000 347586 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 347656 2025-12-04T12:44:29.1145785Z I1204 12:42:44.625000 347586 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 347657 2025-12-04T12:44:29.1145948Z I1204 12:42:44.626000 347586 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 347658 2025-12-04T12:44:29.1146650Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:118: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1146703Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1147413Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:118: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1147482Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1148168Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:118: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. 
Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1148240Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1148909Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:118: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1148997Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1149526Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1149659Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1150165Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1150244Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1150768Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1150852Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1151349Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1151425Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1152107Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:130: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:44:29.1152189Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1152881Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:130: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1152933Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1153464Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1153558Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:44:29.1154055Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1154139Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:44:29.1154837Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:130: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1154907Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1155594Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:130: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1155664Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1156192Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1156259Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:44:29.1156769Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1156856Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:44:29.1157117Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1157184Z local_shape = tensor.shape 2025-12-04T12:44:29.1157434Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1157495Z tensor.shape, 2025-12-04T12:44:29.1157734Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1159563Z tensor.dtype, 2025-12-04T12:44:29.1159853Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1159922Z local_shape = tensor.shape 2025-12-04T12:44:29.1160184Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1160240Z tensor.shape, 2025-12-04T12:44:29.1160501Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1160568Z tensor.dtype, 2025-12-04T12:44:29.1160822Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1160881Z local_shape = tensor.shape 2025-12-04T12:44:29.1161129Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1161210Z local_shape = tensor.shape 2025-12-04T12:44:29.1161471Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1161518Z tensor.shape, 2025-12-04T12:44:29.1161771Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1161825Z tensor.dtype, 2025-12-04T12:44:29.1162088Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1162158Z tensor.shape, 2025-12-04T12:44:29.1162413Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:44:29.1162474Z tensor.dtype, 2025-12-04T12:44:29.1162641Z E1204 12:42:53.925000 347658 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1162825Z E1204 12:42:53.925000 347658 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1163121Z E1204 12:42:53.925000 347658 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1163289Z E1204 12:42:53.925000 347658 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1163588Z E1204 12:42:53.925000 347658 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1163728Z E1204 12:42:53.925000 347658 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1164004Z E1204 12:42:53.925000 347658 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1164188Z E1204 12:42:53.925000 347658 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1164482Z E1204 12:42:53.925000 347658 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1164639Z E1204 12:42:53.925000 347658 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1164929Z E1204 12:42:53.925000 347658 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1165077Z E1204 12:42:53.925000 347658 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1165386Z E1204 12:42:53.925000 347658 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1165536Z E1204 12:42:53.925000 347658 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1166104Z E1204 12:42:53.925000 347658 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 3. CUDA driver allocated memory was 1268776960 and is now 3477078016. 
2025-12-04T12:44:29.1166237Z E1204 12:42:53.925000 347658 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1166433Z E1204 12:42:53.925000 347658 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1166899Z E1204 12:42:53.925000 347658 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda 2025-12-04T12:44:29.1167037Z E1204 12:42:53.925000 347658 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1167266Z E1204 12:42:53.925000 347658 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1167445Z E1204 12:42:53.925000 347658 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:44:29.1167593Z E1204 12:42:53.984000 347657 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1167791Z E1204 12:42:53.984000 347657 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1168087Z E1204 12:42:53.984000 347657 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1168255Z E1204 12:42:53.984000 347657 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1168552Z E1204 12:42:53.984000 347657 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1168685Z E1204 12:42:53.984000 347657 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1168983Z E1204 12:42:53.984000 347657 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1169153Z E1204 12:42:53.984000 347657 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1169447Z E1204 12:42:53.984000 347657 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1169640Z E1204 12:42:53.984000 347657 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1169944Z E1204 12:42:53.984000 347657 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1170100Z E1204 12:42:53.984000 347657 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1170401Z E1204 12:42:53.984000 347657 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1170551Z E1204 12:42:53.984000 347657 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1171115Z E1204 12:42:53.984000 347657 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 2. CUDA driver allocated memory was 1268776960 and is now 3542089728. 2025-12-04T12:44:29.1171247Z E1204 12:42:53.984000 347657 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1171458Z E1204 12:42:53.984000 347657 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1171914Z E1204 12:42:53.984000 347657 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda 2025-12-04T12:44:29.1172048Z E1204 12:42:53.984000 347657 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1172292Z E1204 12:42:53.984000 347657 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1172467Z E1204 12:42:53.984000 347657 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:44:29.1172617Z E1204 12:42:54.491000 347655 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1172796Z E1204 12:42:54.491000 347655 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1173089Z E1204 12:42:54.491000 347655 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1173266Z E1204 12:42:54.491000 347655 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1173561Z E1204 12:42:54.491000 347655 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1173710Z E1204 12:42:54.491000 347655 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1173995Z E1204 12:42:54.491000 347655 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1174165Z E1204 12:42:54.491000 347655 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1174455Z E1204 12:42:54.491000 347655 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1174617Z E1204 12:42:54.491000 347655 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1174925Z E1204 12:42:54.491000 347655 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1175069Z E1204 12:42:54.491000 347655 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1175372Z E1204 12:42:54.491000 347655 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1175531Z E1204 12:42:54.491000 347655 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1176078Z E1204 12:42:54.491000 347655 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 0. CUDA driver allocated memory was 1438646272 and is now 3676307456. 2025-12-04T12:44:29.1176213Z E1204 12:42:54.491000 347655 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1176425Z E1204 12:42:54.491000 347655 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1176887Z E1204 12:42:54.491000 347655 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda 2025-12-04T12:44:29.1177019Z E1204 12:42:54.491000 347655 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1177243Z E1204 12:42:54.491000 347655 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1177434Z E1204 12:42:54.491000 347655 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:44:29.1177584Z E1204 12:42:54.493000 347656 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1177759Z E1204 12:42:54.493000 347656 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1178051Z E1204 12:42:54.493000 347656 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1178222Z E1204 12:42:54.493000 347656 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1178505Z E1204 12:42:54.493000 347656 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1178661Z E1204 12:42:54.493000 347656 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1178954Z E1204 12:42:54.493000 347656 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1179107Z E1204 12:42:54.493000 347656 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1179422Z E1204 12:42:54.493000 347656 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1179565Z E1204 12:42:54.493000 347656 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1179949Z E1204 12:42:54.493000 347656 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1180090Z E1204 12:42:54.493000 347656 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1180404Z E1204 12:42:54.493000 347656 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1180574Z E1204 12:42:54.493000 347656 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1181101Z E1204 12:42:54.493000 347656 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 1. CUDA driver allocated memory was 1268776960 and is now 3292528640. 
2025-12-04T12:44:29.1181268Z E1204 12:42:54.493000 347656 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1181469Z E1204 12:42:54.493000 347656 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1181926Z E1204 12:42:54.493000 347656 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda 2025-12-04T12:44:29.1182069Z E1204 12:42:54.493000 347656 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1182276Z E1204 12:42:54.493000 347656 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1182475Z E1204 12:42:54.493000 347656 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:44:29.1182526Z FAILED [10.7208s] [100%] 2025-12-04T12:44:29.1182528Z 2025-12-04T12:44:29.1182612Z =================================== FAILURES =================================== 2025-12-04T12:44:29.1182779Z _ TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda _ 2025-12-04T12:44:29.1182845Z Traceback (most recent call last): 2025-12-04T12:44:29.1183031Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:44:29.1183102Z self._join_processes(fn) 2025-12-04T12:44:29.1183294Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:44:29.1183370Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:44:29.1183564Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:44:29.1183639Z raise RuntimeError(error) 2025-12-04T12:44:29.1183734Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:44:29.1183809Z Traceback (most recent call last): 2025-12-04T12:44:29.1184011Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1184067Z getattr(self, test_name)() 2025-12-04T12:44:29.1184260Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1184316Z fn() 2025-12-04T12:44:29.1184496Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1184548Z method(*args, **kwargs) 2025-12-04T12:44:29.1184728Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1184773Z method(*args, **kwargs) 2025-12-04T12:44:29.1184976Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1185023Z with policy(): 2025-12-04T12:44:29.1185220Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T12:44:29.1185271Z raise RuntimeError(msg) 2025-12-04T12:44:29.1185694Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 3. CUDA driver allocated memory was 1268776960 and is now 3477078016. 2025-12-04T12:44:29.1185697Z 2025-12-04T12:44:29.1185799Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1186150Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda 2025-12-04T12:44:29.1186153Z 2025-12-04T12:44:29.1186263Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1186275Z 2025-12-04T12:44:29.1186277Z 2025-12-04T12:44:29.1186366Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:44:29.1186477Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:44:29.1186772Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-824ccfd58c5f3ae3.xml - 2025-12-04T12:44:29.1186862Z =========================== short test summary info ============================ 2025-12-04T12:44:29.1187194Z FAILED [10.7208s] distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:44:29.1187264Z Traceback (most recent call last): 2025-12-04T12:44:29.1187455Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1187520Z getattr(self, test_name)() 2025-12-04T12:44:29.1187709Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1187755Z fn() 2025-12-04T12:44:29.1187934Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1187985Z method(*args, **kwargs) 2025-12-04T12:44:29.1188167Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1188224Z method(*args, **kwargs) 2025-12-04T12:44:29.1188400Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1188448Z with policy(): 2025-12-04T12:44:29.1188641Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1188687Z raise RuntimeError(msg) 2025-12-04T12:44:29.1189138Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 3. CUDA driver allocated memory was 1268776960 and is now 3477078016. 
2025-12-04T12:44:29.1189141Z 2025-12-04T12:44:29.1189237Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1189564Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda 2025-12-04T12:44:29.1189566Z 2025-12-04T12:44:29.1189743Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1189815Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:44:29.1189915Z ======================= 1 failed, 7 deselected in 10.73s ======================= 2025-12-04T12:44:29.1189963Z Got exit code 1 2025-12-04T12:44:29.1190032Z Retrying single test... 2025-12-04T12:44:29.1190268Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-d88bd703a098f5c0.xml 2025-12-04T12:44:29.1190344Z ============================= test session starts ============================== 2025-12-04T12:44:29.1190497Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:44:29.1190581Z cachedir: .pytest_cache 2025-12-04T12:44:29.1190750Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:44:29.1190838Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:44:29.1190891Z configfile: pytest.ini 2025-12-04T12:44:29.1191091Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:44:29.1191198Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T12:44:29.1191509Z stepcurrent: skipping 5 already run items. Running only test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda 2025-12-04T12:44:29.1191576Z Running 1 items in this shard 2025-12-04T12:44:29.1191578Z 2025-12-04T12:44:29.1191970Z distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda I1204 12:42:58.134000 348124 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 348193 2025-12-04T12:44:29.1192157Z I1204 12:42:58.134000 348124 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 348194 2025-12-04T12:44:29.1192330Z I1204 12:42:58.135000 348124 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 348195 2025-12-04T12:44:29.1192504Z I1204 12:42:58.135000 348124 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 348196 2025-12-04T12:44:29.1193206Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:118: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:44:29.1193262Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1193983Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:118: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1194042Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1194740Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:118: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1194807Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1195482Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:118: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1195565Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1196076Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1196156Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1196673Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1196734Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1197246Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1197311Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1197814Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1197890Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1198587Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:130: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1198662Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1199355Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:130: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1199413Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1199987Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1200059Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:44:29.1200574Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1200663Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:44:29.1201358Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:130: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1201435Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1202107Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:130: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:44:29.1202178Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1202702Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1202770Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:44:29.1203275Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1203345Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:44:29.1203631Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1203692Z local_shape = tensor.shape 2025-12-04T12:44:29.1203954Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1204013Z tensor.shape, 2025-12-04T12:44:29.1204255Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1204329Z tensor.dtype, 2025-12-04T12:44:29.1204575Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1204641Z local_shape = tensor.shape 2025-12-04T12:44:29.1204896Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1204961Z tensor.shape, 2025-12-04T12:44:29.1205198Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1205278Z tensor.dtype, 2025-12-04T12:44:29.1205523Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1205605Z local_shape = tensor.shape 2025-12-04T12:44:29.1205851Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1205906Z tensor.shape, 2025-12-04T12:44:29.1206195Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:44:29.1206242Z tensor.dtype, 2025-12-04T12:44:29.1206503Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1206555Z local_shape = tensor.shape 2025-12-04T12:44:29.1206804Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1206869Z tensor.shape, 2025-12-04T12:44:29.1207129Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:44:29.1207177Z tensor.dtype, 2025-12-04T12:44:29.1207343Z E1204 12:43:07.432000 348193 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1207511Z E1204 12:43:07.432000 348193 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1207837Z E1204 12:43:07.432000 348193 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1208000Z E1204 12:43:07.432000 348193 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1208311Z E1204 12:43:07.432000 348193 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1208453Z E1204 12:43:07.432000 348193 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1208742Z E1204 12:43:07.432000 348193 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1214579Z E1204 12:43:07.432000 348193 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1214885Z E1204 12:43:07.432000 348193 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1215039Z E1204 12:43:07.432000 348193 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1215368Z E1204 12:43:07.432000 348193 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1215509Z E1204 12:43:07.432000 348193 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1215794Z E1204 12:43:07.432000 348193 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1215940Z E1204 12:43:07.432000 348193 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1216493Z E1204 12:43:07.432000 348193 
site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 0. CUDA driver allocated memory was 1438646272 and is now 3839885312. 2025-12-04T12:44:29.1216624Z E1204 12:43:07.432000 348193 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1216817Z E1204 12:43:07.432000 348193 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1217253Z E1204 12:43:07.432000 348193 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda 2025-12-04T12:44:29.1217366Z E1204 12:43:07.432000 348193 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1217573Z E1204 12:43:07.432000 348193 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1217737Z E1204 12:43:07.432000 348193 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:44:29.1217870Z E1204 12:43:07.510000 348196 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1218027Z E1204 12:43:07.510000 348196 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1218309Z E1204 12:43:07.510000 348196 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1218464Z E1204 12:43:07.510000 348196 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1218748Z E1204 12:43:07.510000 348196 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1218882Z E1204 12:43:07.510000 348196 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1219156Z E1204 12:43:07.510000 348196 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1219298Z E1204 12:43:07.510000 348196 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1219632Z E1204 12:43:07.510000 348196 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1219787Z E1204 12:43:07.510000 348196 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1220060Z E1204 12:43:07.510000 348196 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1220192Z E1204 12:43:07.510000 348196 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1220466Z E1204 12:43:07.510000 348196 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1220625Z E1204 12:43:07.510000 348196 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1221153Z E1204 12:43:07.510000 348196 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 3. CUDA driver allocated memory was 803209216 and is now 3393191936. 2025-12-04T12:44:29.1221284Z E1204 12:43:07.510000 348196 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1221476Z E1204 12:43:07.510000 348196 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1221911Z E1204 12:43:07.510000 348196 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda 2025-12-04T12:44:29.1222025Z E1204 12:43:07.510000 348196 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1222233Z E1204 12:43:07.510000 348196 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1222393Z E1204 12:43:07.510000 348196 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:44:29.1222526Z E1204 12:43:07.522000 348195 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1222680Z E1204 12:43:07.522000 348195 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1222961Z E1204 12:43:07.522000 348195 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1223113Z E1204 12:43:07.522000 348195 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1223405Z E1204 12:43:07.522000 348195 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1223526Z E1204 12:43:07.522000 348195 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1223797Z E1204 12:43:07.522000 348195 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 
3329, in wrapper 2025-12-04T12:44:29.1223939Z E1204 12:43:07.522000 348195 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1224220Z E1204 12:43:07.522000 348195 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1224363Z E1204 12:43:07.522000 348195 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1224636Z E1204 12:43:07.522000 348195 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1224765Z E1204 12:43:07.522000 348195 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1225051Z E1204 12:43:07.522000 348195 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1225192Z E1204 12:43:07.522000 348195 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1225736Z E1204 12:43:07.522000 348195 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 2. CUDA driver allocated memory was 1268776960 and is now 3458203648. 
2025-12-04T12:44:29.1225847Z E1204 12:43:07.522000 348195 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1226035Z E1204 12:43:07.522000 348195 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1226467Z E1204 12:43:07.522000 348195 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda 2025-12-04T12:44:29.1226578Z E1204 12:43:07.522000 348195 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1226784Z E1204 12:43:07.522000 348195 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1226943Z E1204 12:43:07.522000 348195 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:44:29.1227075Z E1204 12:43:07.973000 348194 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1227230Z E1204 12:43:07.973000 348194 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1227509Z E1204 12:43:07.973000 348194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1227667Z E1204 12:43:07.973000 348194 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1227947Z E1204 12:43:07.973000 348194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1228068Z E1204 12:43:07.973000 348194 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1228339Z E1204 12:43:07.973000 348194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1228482Z E1204 12:43:07.973000 348194 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1228764Z E1204 12:43:07.973000 348194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1228905Z E1204 12:43:07.973000 348194 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1229176Z E1204 12:43:07.973000 348194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1229316Z E1204 12:43:07.973000 348194 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1229626Z E1204 12:43:07.973000 348194 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1229789Z E1204 12:43:07.973000 348194 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1230313Z E1204 12:43:07.973000 348194 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 1. CUDA driver allocated memory was 1268776960 and is now 3292528640. 2025-12-04T12:44:29.1230424Z E1204 12:43:07.973000 348194 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1230613Z E1204 12:43:07.973000 348194 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1231050Z E1204 12:43:07.973000 348194 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda 2025-12-04T12:44:29.1231158Z E1204 12:43:07.973000 348194 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1231362Z E1204 12:43:07.973000 348194 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1231520Z E1204 12:43:07.973000 348194 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:44:29.1231567Z FAILED [10.6224s] [100%] 2025-12-04T12:44:29.1231570Z 2025-12-04T12:44:29.1231631Z =================================== FAILURES =================================== 2025-12-04T12:44:29.1231790Z _ TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda _ 2025-12-04T12:44:29.1231844Z Traceback (most recent call last): 2025-12-04T12:44:29.1232023Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:44:29.1232071Z self._join_processes(fn) 2025-12-04T12:44:29.1232245Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:44:29.1232303Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:44:29.1232482Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:44:29.1232531Z raise RuntimeError(error) 2025-12-04T12:44:29.1232613Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:44:29.1232661Z Traceback (most recent call last): 2025-12-04T12:44:29.1253426Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1253484Z getattr(self, test_name)() 2025-12-04T12:44:29.1253648Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1253683Z fn() 2025-12-04T12:44:29.1253837Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1253880Z method(*args, **kwargs) 2025-12-04T12:44:29.1254029Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1254087Z method(*args, **kwargs) 2025-12-04T12:44:29.1254237Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1254275Z with policy(): 2025-12-04T12:44:29.1254429Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1254484Z raise RuntimeError(msg) 2025-12-04T12:44:29.1254894Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 0. CUDA driver allocated memory was 1438646272 and is now 3839885312. 2025-12-04T12:44:29.1254897Z 2025-12-04T12:44:29.1254974Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1255286Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda 2025-12-04T12:44:29.1255288Z 2025-12-04T12:44:29.1255377Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1255380Z 2025-12-04T12:44:29.1255382Z 2025-12-04T12:44:29.1255462Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:44:29.1255548Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:44:29.1255825Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-d88bd703a098f5c0.xml - 2025-12-04T12:44:29.1255887Z =========================== short test summary info ============================ 2025-12-04T12:44:29.1256224Z FAILED [10.6224s] distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:44:29.1256272Z Traceback (most recent call last): 2025-12-04T12:44:29.1256455Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1256520Z getattr(self, test_name)() 2025-12-04T12:44:29.1256681Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1256720Z fn() 2025-12-04T12:44:29.1256871Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1256912Z method(*args, **kwargs) 2025-12-04T12:44:29.1257061Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1257101Z method(*args, **kwargs) 2025-12-04T12:44:29.1257250Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1257287Z with policy(): 2025-12-04T12:44:29.1257450Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1257495Z raise RuntimeError(msg) 2025-12-04T12:44:29.1257903Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda! Caching allocator allocated memory was 0 and is now reported as 27136 on device 0. CUDA driver allocated memory was 1438646272 and is now 3839885312. 2025-12-04T12:44:29.1257906Z 2025-12-04T12:44:29.1257980Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1258302Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda 2025-12-04T12:44:29.1258304Z 2025-12-04T12:44:29.1258390Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1258464Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
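The RuntimeError raised in _check_return_codes above is the parent test process translating a child's nonzero exit status (10 for a failed leak check) into a test failure. A minimal, hypothetical sketch of that spawn-and-check pattern, not the actual MultiProcessTestCase code, assuming a trivial per-rank body:

import multiprocessing as mp
import sys

def _worker(rank: int) -> None:
    # Stand-in for one per-rank test body (assumption); a real worker would run the
    # test and convert failures into a nonzero exit code, as seen in the log above.
    ok = True
    sys.exit(0 if ok else 10)

def main(world_size: int = 4) -> None:
    ctx = mp.get_context("spawn")
    procs = [ctx.Process(target=_worker, args=(rank,)) for rank in range(world_size)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    # Mirrors the _join_processes/_check_return_codes step: any nonzero exit code
    # fails the whole test.
    for rank, p in enumerate(procs):
        if p.exitcode != 0:
            raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")

if __name__ == "__main__":
    main()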
2025-12-04T12:44:29.1258528Z ======================= 1 failed, 7 deselected in 10.63s ======================= 2025-12-04T12:44:29.1258565Z Got exit code 1 2025-12-04T12:44:29.1258824Z FAILED CONSISTENTLY: test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda 2025-12-04T12:44:29.1258953Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:44:29.1259179Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-cc44fcc8e549fc22.xml 2025-12-04T12:44:29.1259238Z ============================= test session starts ============================== 2025-12-04T12:44:29.1259353Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:44:29.1259395Z cachedir: .pytest_cache 2025-12-04T12:44:29.1259553Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:44:29.1259722Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:44:29.1259762Z configfile: pytest.ini 2025-12-04T12:44:29.1259927Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:44:29.1259999Z collecting ... collected 8 items / 6 deselected / 2 selected 2025-12-04T12:44:29.1260053Z stepcurrent: skipping 6 already run items. 2025-12-04T12:44:29.1260097Z Running 2 items in this shard 2025-12-04T12:44:29.1260099Z 2025-12-04T12:44:29.1260433Z distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_hsdp_init_with_device_mesh_cuda I1204 12:43:11.574000 348662 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 348731 2025-12-04T12:44:29.1260607Z I1204 12:43:11.575000 348662 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 348732 2025-12-04T12:44:29.1260758Z I1204 12:43:11.575000 348662 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 348733 2025-12-04T12:44:29.1260909Z I1204 12:43:11.576000 348662 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 348734 2025-12-04T12:44:29.1261616Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:83: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1261664Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1262333Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:83: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1262400Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1263062Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:83: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1263115Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1263775Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:83: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1263817Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1264313Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1264364Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1264848Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1264897Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1265391Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1265437Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1265916Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T12:44:29.1265962Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1266098Z E1204 12:43:20.472000 348731 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1266263Z E1204 12:43:20.472000 348731 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1266549Z E1204 12:43:20.472000 348731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1266696Z E1204 12:43:20.472000 348731 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1266972Z E1204 12:43:20.472000 348731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1267114Z E1204 12:43:20.472000 348731 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1267383Z E1204 12:43:20.472000 348731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1267538Z E1204 12:43:20.472000 348731 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1267805Z E1204 12:43:20.472000 348731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1267945Z E1204 12:43:20.472000 348731 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1268214Z E1204 12:43:20.472000 348731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1268341Z E1204 12:43:20.472000 348731 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1268613Z E1204 12:43:20.472000 348731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1268753Z E1204 12:43:20.472000 348731 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1269237Z E1204 12:43:20.472000 348731 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3210739712. 
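The FutureWarning printed by test_hsdp_dtensor_state_dict.py above points at the torch.distributed.checkpoint state-dict helpers as the replacement for FSDP.state_dict_type()/FSDP.set_state_dict_type(). A hedged sketch of that call pattern follows; it assumes a process group is already initialized and that model and optim are the FSDP-wrapped module and optimizer from such a test, with cpu_offload=True mirroring the offload_to_cpu_True variant of the failing test:

import torch
from torch.distributed.checkpoint.state_dict import (
    StateDictOptions,
    get_state_dict,
    set_state_dict,
)

def save_and_restore(model: torch.nn.Module, optim: torch.optim.Optimizer) -> None:
    opts = StateDictOptions(cpu_offload=True)  # keep gathered tensors on CPU
    model_sd, optim_sd = get_state_dict(model, optim, options=opts)
    # ... persist model_sd / optim_sd (e.g. with torch.distributed.checkpoint) ...
    set_state_dict(
        model,
        optim,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
        options=opts,
    )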
2025-12-04T12:44:29.1269350Z E1204 12:43:20.472000 348731 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1269543Z E1204 12:43:20.472000 348731 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1269985Z E1204 12:43:20.472000 348731 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda 2025-12-04T12:44:29.1270097Z E1204 12:43:20.472000 348731 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1270300Z E1204 12:43:20.472000 348731 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1270460Z E1204 12:43:20.472000 348731 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:44:29.1270593Z E1204 12:43:20.487000 348732 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1270775Z E1204 12:43:20.487000 348732 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1271056Z E1204 12:43:20.487000 348732 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1271201Z E1204 12:43:20.487000 348732 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1271482Z E1204 12:43:20.487000 348732 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1271616Z E1204 12:43:20.487000 348732 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1271890Z E1204 12:43:20.487000 348732 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1272047Z E1204 12:43:20.487000 348732 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1272316Z E1204 12:43:20.487000 348732 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1272457Z E1204 12:43:20.487000 348732 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1272725Z E1204 12:43:20.487000 348732 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1272855Z E1204 12:43:20.487000 348732 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1273125Z E1204 12:43:20.487000 348732 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1273269Z E1204 12:43:20.487000 348732 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1273746Z E1204 12:43:20.487000 348732 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1268776960 and is now 3051356160. 2025-12-04T12:44:29.1273855Z E1204 12:43:20.487000 348732 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1274060Z E1204 12:43:20.487000 348732 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1274437Z E1204 12:43:20.487000 348732 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda 2025-12-04T12:44:29.1274546Z E1204 12:43:20.487000 348732 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1274750Z E1204 12:43:20.487000 348732 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1274909Z E1204 12:43:20.487000 348732 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:44:29.1275049Z E1204 12:43:20.539000 348734 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1275208Z E1204 12:43:20.539000 348734 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1275487Z E1204 12:43:20.539000 348734 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1275633Z E1204 12:43:20.539000 348734 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1275923Z E1204 12:43:20.539000 348734 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1276037Z E1204 12:43:20.539000 348734 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1276321Z E1204 12:43:20.539000 348734 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1276460Z E1204 12:43:20.539000 348734 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1276729Z E1204 12:43:20.539000 348734 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1276870Z E1204 12:43:20.539000 
348734 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1277137Z E1204 12:43:20.539000 348734 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1277268Z E1204 12:43:20.539000 348734 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1277539Z E1204 12:43:20.539000 348734 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1277681Z E1204 12:43:20.539000 348734 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1278152Z E1204 12:43:20.539000 348734 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 3. CUDA driver allocated memory was 1256194048 and is now 3051356160. 2025-12-04T12:44:29.1278264Z E1204 12:43:20.539000 348734 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1278464Z E1204 12:43:20.539000 348734 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1278836Z E1204 12:43:20.539000 348734 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda 2025-12-04T12:44:29.1278945Z E1204 12:43:20.539000 348734 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1279147Z E1204 12:43:20.539000 348734 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1279315Z E1204 12:43:20.539000 348734 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:44:29.1279447Z E1204 12:43:20.543000 348733 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1279644Z E1204 12:43:20.543000 348733 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1279925Z E1204 12:43:20.543000 348733 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1280089Z E1204 12:43:20.543000 348733 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1280369Z E1204 12:43:20.543000 348733 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1280501Z E1204 12:43:20.543000 348733 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1280773Z E1204 12:43:20.543000 348733 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1280911Z E1204 12:43:20.543000 348733 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1281180Z E1204 12:43:20.543000 348733 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1281320Z E1204 12:43:20.543000 348733 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1281590Z E1204 12:43:20.543000 348733 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1281720Z E1204 12:43:20.543000 348733 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1281988Z E1204 12:43:20.543000 348733 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1282129Z E1204 12:43:20.543000 348733 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1282602Z E1204 12:43:20.543000 348733 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 2. CUDA driver allocated memory was 1268776960 and is now 3051356160. 
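The second failing test, test_hsdp_init_with_device_mesh_cuda, exercises hybrid-sharded FSDP built on a 2-D DeviceMesh. As an illustrative sketch only (not the test's actual code), assuming torch.distributed is already initialized with the 4 ranks used above, the construction looks roughly like this:

import torch
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, ShardingStrategy

def build_hsdp_model(module: torch.nn.Module) -> FSDP:
    # Outer mesh dimension replicates, inner dimension shards (2 x 2 = 4 ranks).
    mesh = init_device_mesh("cuda", (2, 2), mesh_dim_names=("replicate", "shard"))
    return FSDP(
        module.cuda(),
        device_mesh=mesh,
        sharding_strategy=ShardingStrategy.HYBRID_SHARD,
    )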
2025-12-04T12:44:29.1282728Z E1204 12:43:20.543000 348733 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1282917Z E1204 12:43:20.543000 348733 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1283292Z E1204 12:43:20.543000 348733 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda 2025-12-04T12:44:29.1283403Z E1204 12:43:20.543000 348733 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1283633Z E1204 12:43:20.543000 348733 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1283795Z E1204 12:43:20.543000 348733 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:44:29.1283836Z FAILED [10.3192s] [ 50%] 2025-12-04T12:44:29.1283838Z 2025-12-04T12:44:29.1283897Z =================================== FAILURES =================================== 2025-12-04T12:44:29.1284004Z __ TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda ___ 2025-12-04T12:44:29.1284052Z Traceback (most recent call last): 2025-12-04T12:44:29.1284215Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:44:29.1284272Z self._join_processes(fn) 2025-12-04T12:44:29.1284444Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:44:29.1284501Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:44:29.1284694Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:44:29.1284743Z raise RuntimeError(error) 2025-12-04T12:44:29.1284823Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:44:29.1284873Z Traceback (most recent call last): 2025-12-04T12:44:29.1285040Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1285086Z getattr(self, test_name)() 2025-12-04T12:44:29.1285249Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1285285Z fn() 2025-12-04T12:44:29.1285439Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1285480Z method(*args, **kwargs) 2025-12-04T12:44:29.1285634Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1285675Z method(*args, **kwargs) 2025-12-04T12:44:29.1285828Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1285866Z with policy(): 2025-12-04T12:44:29.1286021Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1286062Z raise RuntimeError(msg) 2025-12-04T12:44:29.1286417Z 
RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3210739712. 2025-12-04T12:44:29.1286419Z 2025-12-04T12:44:29.1286495Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1286766Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda 2025-12-04T12:44:29.1286769Z 2025-12-04T12:44:29.1286856Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1286861Z 2025-12-04T12:44:29.1286862Z 2025-12-04T12:44:29.1286937Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:44:29.1287026Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:44:29.1287300Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-cc44fcc8e549fc22.xml - 2025-12-04T12:44:29.1287363Z =========================== short test summary info ============================ 2025-12-04T12:44:29.1287648Z FAILED [10.3192s] distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_hsdp_init_with_device_mesh_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:44:29.1287699Z Traceback (most recent call last): 2025-12-04T12:44:29.1287864Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1287909Z getattr(self, test_name)() 2025-12-04T12:44:29.1288068Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1288117Z fn() 2025-12-04T12:44:29.1288267Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1288309Z method(*args, **kwargs) 2025-12-04T12:44:29.1288461Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1288513Z method(*args, **kwargs) 2025-12-04T12:44:29.1288664Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1288705Z with policy(): 2025-12-04T12:44:29.1288857Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1288900Z raise RuntimeError(msg) 2025-12-04T12:44:29.1289256Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3210739712. 
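The repro command printed above can also be driven from a small script; the sketch below simply sets the same environment variables the log names and runs the exact command from the repo root (adding PYTORCH_PRINT_REPRO_ON_FAILURE="0" would suppress the repro banner, per the log):

import os
import subprocess

env = dict(
    os.environ,
    PYTORCH_TEST_WITH_ROCM="1",
    PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1",
)
subprocess.run(
    [
        "python",
        "test/distributed/fsdp/test_hsdp_dtensor_state_dict.py",
        "TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda",
    ],
    env=env,
    check=True,  # raise if the test exits nonzero, like the CI harness does
)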
2025-12-04T12:44:29.1289259Z 2025-12-04T12:44:29.1289334Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1289635Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda 2025-12-04T12:44:29.1289639Z 2025-12-04T12:44:29.1289726Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1289795Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:44:29.1289857Z ======================= 1 failed, 6 deselected in 10.33s ======================= 2025-12-04T12:44:29.1289897Z Got exit code 1 2025-12-04T12:44:29.1289940Z Retrying single test... 2025-12-04T12:44:29.1290168Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-2cdbbc51773bb60a.xml 2025-12-04T12:44:29.1290227Z ============================= test session starts ============================== 2025-12-04T12:44:29.1290344Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:44:29.1290386Z cachedir: .pytest_cache 2025-12-04T12:44:29.1290561Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:44:29.1290610Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:44:29.1290650Z configfile: pytest.ini 2025-12-04T12:44:29.1290818Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:44:29.1290891Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T12:44:29.1291144Z stepcurrent: skipping 6 already run items. Running only test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_hsdp_init_with_device_mesh_cuda 2025-12-04T12:44:29.1291188Z Running 1 items in this shard 2025-12-04T12:44:29.1291191Z 2025-12-04T12:44:29.1291543Z distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_hsdp_init_with_device_mesh_cuda I1204 12:43:24.455000 349132 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 349201 2025-12-04T12:44:29.1291700Z I1204 12:43:24.456000 349132 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 349202 2025-12-04T12:44:29.1291854Z I1204 12:43:24.456000 349132 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 349203 2025-12-04T12:44:29.1292004Z I1204 12:43:24.457000 349132 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 349204 2025-12-04T12:44:29.1292702Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:83: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:44:29.1292762Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1293422Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:83: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1293469Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1294126Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:83: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1294171Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1294831Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:83: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1294872Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1295381Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1295432Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1295923Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1295975Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1296474Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1296525Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1297008Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1297073Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1297208Z E1204 12:43:33.177000 349204 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1297364Z E1204 12:43:33.177000 349204 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1297663Z E1204 12:43:33.177000 349204 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1297808Z E1204 12:43:33.177000 349204 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1298088Z E1204 12:43:33.177000 349204 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1298204Z E1204 12:43:33.177000 349204 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1298481Z E1204 12:43:33.177000 349204 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1298623Z E1204 12:43:33.177000 349204 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1298894Z E1204 12:43:33.177000 349204 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1299037Z E1204 12:43:33.177000 349204 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1299306Z E1204 12:43:33.177000 349204 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1299438Z E1204 12:43:33.177000 349204 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1299772Z E1204 12:43:33.177000 349204 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1299915Z E1204 12:43:33.177000 349204 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1300393Z E1204 12:43:33.177000 349204 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 3. CUDA driver allocated memory was 1260388352 and is now 3051356160. 
2025-12-04T12:44:29.1300505Z E1204 12:43:33.177000 349204 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1300712Z E1204 12:43:33.177000 349204 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1301087Z E1204 12:43:33.177000 349204 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda 2025-12-04T12:44:29.1301196Z E1204 12:43:33.177000 349204 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1301400Z E1204 12:43:33.177000 349204 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1301576Z E1204 12:43:33.177000 349204 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:44:29.1301707Z E1204 12:43:33.179000 349203 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1301877Z E1204 12:43:33.179000 349203 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1302158Z E1204 12:43:33.179000 349203 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1302304Z E1204 12:43:33.179000 349203 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1302584Z E1204 12:43:33.179000 349203 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1302698Z E1204 12:43:33.179000 349203 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1302969Z E1204 12:43:33.179000 349203 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1303107Z E1204 12:43:33.179000 349203 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1303376Z E1204 12:43:33.179000 349203 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1303516Z E1204 12:43:33.179000 349203 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1303787Z E1204 12:43:33.179000 349203 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1303928Z E1204 12:43:33.179000 349203 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1304198Z E1204 12:43:33.179000 349203 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1304340Z E1204 12:43:33.179000 349203 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1304812Z E1204 12:43:33.179000 349203 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 2. CUDA driver allocated memory was 1268776960 and is now 3051356160. 2025-12-04T12:44:29.1304934Z E1204 12:43:33.179000 349203 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1305127Z E1204 12:43:33.179000 349203 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1305503Z E1204 12:43:33.179000 349203 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda 2025-12-04T12:44:29.1305611Z E1204 12:43:33.179000 349203 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1305824Z E1204 12:43:33.179000 349203 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1305984Z E1204 12:43:33.179000 349203 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:44:29.1306125Z E1204 12:43:33.218000 349201 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1306277Z E1204 12:43:33.218000 349201 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1306555Z E1204 12:43:33.218000 349201 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1306704Z E1204 12:43:33.218000 349201 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1306982Z E1204 12:43:33.218000 349201 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1307097Z E1204 12:43:33.218000 349201 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1307366Z E1204 12:43:33.218000 349201 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1307504Z E1204 12:43:33.218000 349201 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1307775Z E1204 12:43:33.218000 349201 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1307914Z E1204 12:43:33.218000 
349201 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1308184Z E1204 12:43:33.218000 349201 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1308324Z E1204 12:43:33.218000 349201 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1308595Z E1204 12:43:33.218000 349201 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1308735Z E1204 12:43:33.218000 349201 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1309217Z E1204 12:43:33.218000 349201 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3206545408. 2025-12-04T12:44:29.1309330Z E1204 12:43:33.218000 349201 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1309520Z E1204 12:43:33.218000 349201 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1309936Z E1204 12:43:33.218000 349201 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda 2025-12-04T12:44:29.1310057Z E1204 12:43:33.218000 349201 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1310259Z E1204 12:43:33.218000 349201 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1310435Z E1204 12:43:33.218000 349201 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:44:29.1310564Z E1204 12:43:33.266000 349202 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1310716Z E1204 12:43:33.266000 349202 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1310993Z E1204 12:43:33.266000 349202 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1311141Z E1204 12:43:33.266000 349202 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1311418Z E1204 12:43:33.266000 349202 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1311537Z E1204 12:43:33.266000 349202 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1311810Z E1204 12:43:33.266000 349202 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1311953Z E1204 12:43:33.266000 349202 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1312223Z E1204 12:43:33.266000 349202 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1312361Z E1204 12:43:33.266000 349202 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1312650Z E1204 12:43:33.266000 349202 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1312777Z E1204 12:43:33.266000 349202 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1313048Z E1204 12:43:33.266000 349202 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1313188Z E1204 12:43:33.266000 349202 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1313673Z E1204 12:43:33.266000 349202 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1268776960 and is now 3051356160. 
2025-12-04T12:44:29.1313784Z E1204 12:43:33.266000 349202 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1313973Z E1204 12:43:33.266000 349202 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1314348Z E1204 12:43:33.266000 349202 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda 2025-12-04T12:44:29.1314466Z E1204 12:43:33.266000 349202 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1314672Z E1204 12:43:33.266000 349202 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1314854Z E1204 12:43:33.266000 349202 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:44:29.1314896Z FAILED [10.0198s] [100%] 2025-12-04T12:44:29.1314898Z 2025-12-04T12:44:29.1314957Z =================================== FAILURES =================================== 2025-12-04T12:44:29.1315066Z __ TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda ___ 2025-12-04T12:44:29.1315118Z Traceback (most recent call last): 2025-12-04T12:44:29.1315284Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:44:29.1315331Z self._join_processes(fn) 2025-12-04T12:44:29.1315504Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:44:29.1315565Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:44:29.1315747Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:44:29.1315793Z raise RuntimeError(error) 2025-12-04T12:44:29.1315873Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:44:29.1315923Z Traceback (most recent call last): 2025-12-04T12:44:29.1316085Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1316133Z getattr(self, test_name)() 2025-12-04T12:44:29.1316293Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1316331Z fn() 2025-12-04T12:44:29.1316485Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1316531Z method(*args, **kwargs) 2025-12-04T12:44:29.1316697Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1316740Z method(*args, **kwargs) 2025-12-04T12:44:29.1316892Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1316932Z with policy(): 2025-12-04T12:44:29.1317087Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1317133Z raise RuntimeError(msg) 2025-12-04T12:44:29.1317492Z 
RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 3. CUDA driver allocated memory was 1260388352 and is now 3051356160. 2025-12-04T12:44:29.1317495Z 2025-12-04T12:44:29.1317582Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1317845Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda 2025-12-04T12:44:29.1317847Z 2025-12-04T12:44:29.1317936Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1317938Z 2025-12-04T12:44:29.1317940Z 2025-12-04T12:44:29.1318018Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:44:29.1318120Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:44:29.1318396Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-2cdbbc51773bb60a.xml - 2025-12-04T12:44:29.1318459Z =========================== short test summary info ============================ 2025-12-04T12:44:29.1318742Z FAILED [10.0198s] distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_hsdp_init_with_device_mesh_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:44:29.1318790Z Traceback (most recent call last): 2025-12-04T12:44:29.1318956Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1319003Z getattr(self, test_name)() 2025-12-04T12:44:29.1319162Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1319202Z fn() 2025-12-04T12:44:29.1319353Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1319397Z method(*args, **kwargs) 2025-12-04T12:44:29.1319550Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1319622Z method(*args, **kwargs) 2025-12-04T12:44:29.1319773Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1319813Z with policy(): 2025-12-04T12:44:29.1319965Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1320009Z raise RuntimeError(msg) 2025-12-04T12:44:29.1320360Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 3. CUDA driver allocated memory was 1260388352 and is now 3051356160. 
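The RuntimeError above comes from the leak checker snapshotting per-device memory before the test body and comparing again afterwards, once for the caching allocator and once for the driver, which is where the "was ... and is now ..." numbers originate. The following is a minimal illustrative sketch of that before/after comparison only; it is not the actual CUDAMemoryLeakCheck code in common_utils.py, and the helper name run_with_leak_check is hypothetical.

import torch

def run_with_leak_check(test_fn, device=0):
    # Snapshot caching-allocator usage and driver-side free memory before the test.
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    alloc_before = torch.cuda.memory_allocated(device)
    free_before, _total = torch.cuda.mem_get_info(device)

    test_fn()

    # Re-check after the test; growth on either counter is reported as a leak,
    # mirroring the allocator/driver byte counts printed in the log above.
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _total = torch.cuda.mem_get_info(device)
    if alloc_after > alloc_before or free_after < free_before:
        raise RuntimeError(
            f"possible leak on device {device}: caching allocator "
            f"{alloc_before} -> {alloc_after} bytes, driver free "
            f"{free_before} -> {free_after} bytes"
        )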
2025-12-04T12:44:29.1320366Z 2025-12-04T12:44:29.1320439Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1320713Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda 2025-12-04T12:44:29.1320715Z 2025-12-04T12:44:29.1320802Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1320867Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:44:29.1320931Z ======================= 1 failed, 7 deselected in 10.03s ======================= 2025-12-04T12:44:29.1320970Z Got exit code 1 2025-12-04T12:44:29.1321013Z Retrying single test... 2025-12-04T12:44:29.1321241Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-66cb2d4bdeb6a169.xml 2025-12-04T12:44:29.1321299Z ============================= test session starts ============================== 2025-12-04T12:44:29.1321427Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:44:29.1321471Z cachedir: .pytest_cache 2025-12-04T12:44:29.1321634Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:44:29.1321680Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:44:29.1321724Z configfile: pytest.ini 2025-12-04T12:44:29.1321890Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:44:29.1321965Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T12:44:29.1322229Z stepcurrent: skipping 6 already run items. Running only test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_hsdp_init_with_device_mesh_cuda 2025-12-04T12:44:29.1322277Z Running 1 items in this shard 2025-12-04T12:44:29.1322279Z 2025-12-04T12:44:29.1322613Z distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_hsdp_init_with_device_mesh_cuda I1204 12:43:37.261000 349602 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 349671 2025-12-04T12:44:29.1322781Z I1204 12:43:37.262000 349602 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 349672 2025-12-04T12:44:29.1322937Z I1204 12:43:37.262000 349602 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 349673 2025-12-04T12:44:29.1323088Z I1204 12:43:37.263000 349602 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 349674 2025-12-04T12:44:29.1323765Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:83: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
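The repro line the log prints under "To execute this test" can also be launched from Python; a sketch under the assumption that it is run from the base repo dir, using only the environment variables and test path already shown above:

import os
import subprocess

# Same command the log prints for reproducing the leak-check failure locally.
env = dict(
    os.environ,
    PYTORCH_TEST_WITH_ROCM="1",
    PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1",
)
subprocess.run(
    [
        "python",
        "test/distributed/fsdp/test_hsdp_dtensor_state_dict.py",
        "TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda",
    ],
    env=env,
    check=False,
)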
2025-12-04T12:44:29.1323809Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1324477Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:83: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1324522Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1325192Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:83: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1325238Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1325899Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_hsdp_dtensor_state_dict.py:83: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1325941Z FSDP.set_state_dict_type( 2025-12-04T12:44:29.1326455Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1326505Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1326990Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1327051Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1327540Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1327601Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1328085Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:44:29.1328135Z device = _get_pg_default_device(group) 2025-12-04T12:44:29.1328271Z E1204 12:43:46.003000 349671 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1328431Z E1204 12:43:46.003000 349671 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1328712Z E1204 12:43:46.003000 349671 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1328861Z E1204 12:43:46.003000 349671 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1329142Z E1204 12:43:46.003000 349671 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1329258Z E1204 12:43:46.003000 349671 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1329542Z E1204 12:43:46.003000 349671 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1329716Z E1204 12:43:46.003000 349671 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1329988Z E1204 12:43:46.003000 349671 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1330130Z E1204 12:43:46.003000 349671 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1330403Z E1204 12:43:46.003000 349671 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1330551Z E1204 12:43:46.003000 349671 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1330822Z E1204 12:43:46.003000 349671 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1330967Z E1204 12:43:46.003000 349671 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1331449Z E1204 12:43:46.003000 349671 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3210739712. 
2025-12-04T12:44:29.1331586Z E1204 12:43:46.003000 349671 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1331791Z E1204 12:43:46.003000 349671 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1332167Z E1204 12:43:46.003000 349671 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda 2025-12-04T12:44:29.1332278Z E1204 12:43:46.003000 349671 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1332482Z E1204 12:43:46.003000 349671 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1332642Z E1204 12:43:46.003000 349671 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:44:29.1332774Z E1204 12:43:46.063000 349674 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1332929Z E1204 12:43:46.063000 349674 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1333207Z E1204 12:43:46.063000 349674 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1333357Z E1204 12:43:46.063000 349674 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1333635Z E1204 12:43:46.063000 349674 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1333753Z E1204 12:43:46.063000 349674 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1334038Z E1204 12:43:46.063000 349674 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1334179Z E1204 12:43:46.063000 349674 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1334449Z E1204 12:43:46.063000 349674 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1334589Z E1204 12:43:46.063000 349674 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1334868Z E1204 12:43:46.063000 349674 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1334997Z E1204 12:43:46.063000 349674 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1335271Z E1204 12:43:46.063000 349674 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1335412Z E1204 12:43:46.063000 349674 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1335893Z E1204 12:43:46.063000 349674 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 3. CUDA driver allocated memory was 1254096896 and is now 3051356160. 2025-12-04T12:44:29.1336014Z E1204 12:43:46.063000 349674 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1336203Z E1204 12:43:46.063000 349674 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1336588Z E1204 12:43:46.063000 349674 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda 2025-12-04T12:44:29.1336695Z E1204 12:43:46.063000 349674 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1336900Z E1204 12:43:46.063000 349674 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1337060Z E1204 12:43:46.063000 349674 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:44:29.1337190Z E1204 12:43:46.080000 349673 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1337345Z E1204 12:43:46.080000 349673 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1337622Z E1204 12:43:46.080000 349673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1337771Z E1204 12:43:46.080000 349673 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1338048Z E1204 12:43:46.080000 349673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1338176Z E1204 12:43:46.080000 349673 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1338445Z E1204 12:43:46.080000 349673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1338585Z E1204 12:43:46.080000 349673 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1338855Z E1204 12:43:46.080000 349673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1338994Z E1204 12:43:46.080000 
349673 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1339275Z E1204 12:43:46.080000 349673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1339404Z E1204 12:43:46.080000 349673 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1339714Z E1204 12:43:46.080000 349673 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1339868Z E1204 12:43:46.080000 349673 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1340340Z E1204 12:43:46.080000 349673 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 2. CUDA driver allocated memory was 1268776960 and is now 3051356160. 2025-12-04T12:44:29.1340462Z E1204 12:43:46.080000 349673 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1340652Z E1204 12:43:46.080000 349673 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1341028Z E1204 12:43:46.080000 349673 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda 2025-12-04T12:44:29.1341136Z E1204 12:43:46.080000 349673 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1341341Z E1204 12:43:46.080000 349673 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1341501Z E1204 12:43:46.080000 349673 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:44:29.1341632Z E1204 12:43:46.083000 349672 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1341786Z E1204 12:43:46.083000 349672 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1342064Z E1204 12:43:46.083000 349672 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1342213Z E1204 12:43:46.083000 349672 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1342503Z E1204 12:43:46.083000 349672 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1342621Z E1204 12:43:46.083000 349672 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1342886Z E1204 12:43:46.083000 349672 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1343031Z E1204 12:43:46.083000 349672 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1343299Z E1204 12:43:46.083000 349672 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1343454Z E1204 12:43:46.083000 349672 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1343725Z E1204 12:43:46.083000 349672 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1343851Z E1204 12:43:46.083000 349672 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1344123Z E1204 12:43:46.083000 349672 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1344274Z E1204 12:43:46.083000 349672 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1344748Z E1204 12:43:46.083000 349672 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1268776960 and is now 3051356160. 
2025-12-04T12:44:29.1344870Z E1204 12:43:46.083000 349672 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1345059Z E1204 12:43:46.083000 349672 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1345434Z E1204 12:43:46.083000 349672 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda 2025-12-04T12:44:29.1345541Z E1204 12:43:46.083000 349672 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1345751Z E1204 12:43:46.083000 349672 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1345908Z E1204 12:43:46.083000 349672 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:44:29.1345953Z FAILED [10.0218s] [100%] 2025-12-04T12:44:29.1345955Z 2025-12-04T12:44:29.1346012Z =================================== FAILURES =================================== 2025-12-04T12:44:29.1346123Z __ TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda ___ 2025-12-04T12:44:29.1346171Z Traceback (most recent call last): 2025-12-04T12:44:29.1346338Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:44:29.1346382Z self._join_processes(fn) 2025-12-04T12:44:29.1346558Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:44:29.1346627Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:44:29.1346806Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:44:29.1346853Z raise RuntimeError(error) 2025-12-04T12:44:29.1346933Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:44:29.1346981Z Traceback (most recent call last): 2025-12-04T12:44:29.1347144Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1347191Z getattr(self, test_name)() 2025-12-04T12:44:29.1347352Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1347390Z fn() 2025-12-04T12:44:29.1347559Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1347608Z method(*args, **kwargs) 2025-12-04T12:44:29.1347759Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1347802Z method(*args, **kwargs) 2025-12-04T12:44:29.1347954Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1347995Z with policy(): 2025-12-04T12:44:29.1348147Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1348206Z raise RuntimeError(msg) 2025-12-04T12:44:29.1348559Z 
RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3210739712. 2025-12-04T12:44:29.1348572Z 2025-12-04T12:44:29.1348650Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1348907Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda 2025-12-04T12:44:29.1348909Z 2025-12-04T12:44:29.1348998Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1349000Z 2025-12-04T12:44:29.1349002Z 2025-12-04T12:44:29.1349080Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:44:29.1349167Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:44:29.1349444Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-66cb2d4bdeb6a169.xml - 2025-12-04T12:44:29.1349506Z =========================== short test summary info ============================ 2025-12-04T12:44:29.1350042Z FAILED [10.0218s] distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_hsdp_init_with_device_mesh_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:44:29.1350089Z Traceback (most recent call last): 2025-12-04T12:44:29.1350257Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1350301Z getattr(self, test_name)() 2025-12-04T12:44:29.1350465Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1350502Z fn() 2025-12-04T12:44:29.1350652Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1350697Z method(*args, **kwargs) 2025-12-04T12:44:29.1350865Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1350908Z method(*args, **kwargs) 2025-12-04T12:44:29.1351058Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1351098Z with policy(): 2025-12-04T12:44:29.1351249Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1351294Z raise RuntimeError(msg) 2025-12-04T12:44:29.1351648Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda! Caching allocator allocated memory was 0 and is now reported as 13824 on device 0. CUDA driver allocated memory was 1438646272 and is now 3210739712. 
2025-12-04T12:44:29.1351651Z 2025-12-04T12:44:29.1351745Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1352002Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_hsdp_init_with_device_mesh_cuda 2025-12-04T12:44:29.1352004Z 2025-12-04T12:44:29.1352096Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1352162Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:44:29.1352226Z ======================= 1 failed, 7 deselected in 10.03s ======================= 2025-12-04T12:44:29.1352284Z Got exit code 1 2025-12-04T12:44:29.1352494Z FAILED CONSISTENTLY: test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_hsdp_init_with_device_mesh_cuda 2025-12-04T12:44:29.1352623Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:44:29.1352863Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-02c126275fdbc82a.xml 2025-12-04T12:44:29.1352924Z ============================= test session starts ============================== 2025-12-04T12:44:29.1353037Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:44:29.1353082Z cachedir: .pytest_cache 2025-12-04T12:44:29.1353241Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:44:29.1353290Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:44:29.1353331Z configfile: pytest.ini 2025-12-04T12:44:29.1353499Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:44:29.1353571Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T12:44:29.1353628Z stepcurrent: skipping 7 already run items. 2025-12-04T12:44:29.1353671Z Running 1 items in this shard 2025-12-04T12:44:29.1353674Z 2025-12-04T12:44:29.1354005Z distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_root_module_is_not_FSDP_cuda I1204 12:43:50.179000 350072 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 350141 2025-12-04T12:44:29.1354161Z I1204 12:43:50.179000 350072 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 350142 2025-12-04T12:44:29.1354318Z I1204 12:43:50.180000 350072 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 350143 2025-12-04T12:44:29.1354473Z I1204 12:43:50.180000 350072 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 350144 2025-12-04T12:44:29.1355570Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). 
If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:44:29.1355700Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.1356775Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:44:29.1356912Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.1357970Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:44:29.1358106Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.1359169Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 
2025-12-04T12:44:29.1359291Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.1360184Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1360282Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:44:29.1360990Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1361098Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:44:29.1361800Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1361892Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:44:29.1362610Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1362709Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:44:29.1363405Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:829: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:44:29.1363470Z FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:44:29.1364168Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:829: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1364232Z FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:44:29.1364927Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:829: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1365006Z FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:44:29.1365703Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:829: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:44:29.1365763Z FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:44:29.1365900Z E1204 12:43:58.880000 350144 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1366066Z E1204 12:43:58.880000 350144 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1366351Z E1204 12:43:58.880000 350144 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1366498Z E1204 12:43:58.880000 350144 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1366780Z E1204 12:43:58.880000 350144 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1366910Z E1204 12:43:58.880000 350144 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1367183Z E1204 12:43:58.880000 350144 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1367343Z E1204 12:43:58.880000 350144 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1367613Z E1204 12:43:58.880000 350144 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1367758Z E1204 12:43:58.880000 350144 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1368029Z E1204 12:43:58.880000 350144 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1368163Z E1204 12:43:58.880000 350144 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1368440Z E1204 12:43:58.880000 350144 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1368581Z E1204 12:43:58.880000 350144 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1369059Z E1204 12:43:58.880000 350144 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda! Caching allocator allocated memory was 0 and is now reported as 512 on device 3. CUDA driver allocated memory was 1268776960 and is now 3047161856. 
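The FutureWarnings above point at the replacement for FSDP.set_state_dict_type(): the get_state_dict()/set_state_dict() helpers in torch.distributed.checkpoint.state_dict. Below is a minimal sketch of that API on a stand-in module; the Linear model and SGD optimizer are hypothetical placeholders, not the test's actual FSDP-wrapped model.

import torch
import torch.nn as nn
from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict

# Hypothetical stand-ins; the test would pass its FSDP-wrapped model instead.
model = nn.Linear(4, 4)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Non-deprecated path the warning recommends: fetch both state dicts...
model_sd, optim_sd = get_state_dict(model, optimizer)
# ...and restore them later, instead of toggling FSDP.set_state_dict_type().
set_state_dict(
    model,
    optimizer,
    model_state_dict=model_sd,
    optim_state_dict=optim_sd,
)

Per the doc link in the warning, the same pair of helpers also covers DDP and FSDP2 models, so one code path can serve the different parallelisms the test suite exercises.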
2025-12-04T12:44:29.1369170Z E1204 12:43:58.880000 350144 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1369366Z E1204 12:43:58.880000 350144 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1369777Z E1204 12:43:58.880000 350144 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda 2025-12-04T12:44:29.1369888Z E1204 12:43:58.880000 350144 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1370095Z E1204 12:43:58.880000 350144 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1370255Z E1204 12:43:58.880000 350144 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:44:29.1370408Z E1204 12:43:58.903000 350142 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1370562Z E1204 12:43:58.903000 350142 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1370844Z E1204 12:43:58.903000 350142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1370992Z E1204 12:43:58.903000 350142 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1371287Z E1204 12:43:58.903000 350142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1371401Z E1204 12:43:58.903000 350142 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1371674Z E1204 12:43:58.903000 350142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1371833Z E1204 12:43:58.903000 350142 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1372102Z E1204 12:43:58.903000 350142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1372245Z E1204 12:43:58.903000 350142 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1372514Z E1204 12:43:58.903000 350142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1372646Z E1204 12:43:58.903000 350142 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1372917Z E1204 12:43:58.903000 350142 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1373061Z E1204 12:43:58.903000 350142 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1373531Z E1204 12:43:58.903000 350142 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda! Caching allocator allocated memory was 0 and is now reported as 512 on device 1. CUDA driver allocated memory was 1268776960 and is now 3047161856. 2025-12-04T12:44:29.1373641Z E1204 12:43:58.903000 350142 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1373848Z E1204 12:43:58.903000 350142 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1374220Z E1204 12:43:58.903000 350142 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda 2025-12-04T12:44:29.1374330Z E1204 12:43:58.903000 350142 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1374536Z E1204 12:43:58.903000 350142 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1374696Z E1204 12:43:58.903000 350142 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:44:29.1374842Z E1204 12:43:58.905000 350141 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1374996Z E1204 12:43:58.905000 350141 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1375275Z E1204 12:43:58.905000 350141 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1375421Z E1204 12:43:58.905000 350141 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1375709Z E1204 12:43:58.905000 350141 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1375825Z E1204 12:43:58.905000 350141 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1376110Z E1204 12:43:58.905000 350141 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1376250Z E1204 12:43:58.905000 350141 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1376520Z E1204 12:43:58.905000 350141 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1376662Z E1204 12:43:58.905000 350141 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1376934Z E1204 12:43:58.905000 350141 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1377065Z E1204 12:43:58.905000 350141 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1377334Z E1204 12:43:58.905000 350141 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1377478Z E1204 12:43:58.905000 350141 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1377942Z E1204 12:43:58.905000 350141 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda! Caching allocator allocated memory was 0 and is now reported as 512 on device 0. CUDA driver allocated memory was 1438646272 and is now 3208642560. 2025-12-04T12:44:29.1378054Z E1204 12:43:58.905000 350141 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1378258Z E1204 12:43:58.905000 350141 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1378628Z E1204 12:43:58.905000 350141 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda 2025-12-04T12:44:29.1378739Z E1204 12:43:58.905000 350141 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1378942Z E1204 12:43:58.905000 350141 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1379115Z E1204 12:43:58.905000 350141 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:44:29.1379252Z E1204 12:43:58.916000 350143 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1379408Z E1204 12:43:58.916000 350143 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1379733Z E1204 12:43:58.916000 350143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1379895Z E1204 12:43:58.916000 350143 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1380178Z E1204 12:43:58.916000 350143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1380319Z E1204 12:43:58.916000 350143 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1380596Z E1204 12:43:58.916000 350143 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1380738Z E1204 12:43:58.916000 350143 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1381010Z E1204 12:43:58.916000 350143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1381153Z E1204 12:43:58.916000 350143 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1381425Z E1204 12:43:58.916000 350143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1381556Z E1204 12:43:58.916000 350143 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1381828Z E1204 12:43:58.916000 350143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1381974Z E1204 12:43:58.916000 350143 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1382442Z E1204 12:43:58.916000 350143 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda! Caching allocator allocated memory was 0 and is now reported as 512 on device 2. CUDA driver allocated memory was 1268776960 and is now 3047161856. 
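[editor's note] For context on the RuntimeError above: PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 wraps each test in a before/after comparison of per-device memory counters, and the numbers quoted in the message (caching allocator 0 -> 512, driver 1268776960 -> 3047161856) are exactly those counters. The snippet below is a minimal sketch of that kind of check using public torch.cuda APIs; it is not the actual CudaMemoryLeakCheck implementation in torch/testing/_internal/common_utils.py, which also garbage-collects, synchronizes and re-checks before reporting a leak.

```python
# Rough sketch of the before/after comparison the mem-leak check performs.
# Illustrative only; the real check is stricter and retries before failing.
import torch

def run_with_leak_check(test_fn, device=0):
    torch.cuda.synchronize(device)
    caching_before = torch.cuda.memory_allocated(device)   # caching allocator bytes
    free_b, total_b = torch.cuda.mem_get_info(device)
    driver_before = total_b - free_b                        # driver-allocated bytes

    test_fn()

    torch.cuda.synchronize(device)
    caching_after = torch.cuda.memory_allocated(device)
    free_a, total_a = torch.cuda.mem_get_info(device)
    driver_after = total_a - free_a

    if caching_after > caching_before and driver_after > driver_before:
        raise RuntimeError(
            f"possible leak: caching allocator {caching_before} -> {caching_after}, "
            f"driver {driver_before} -> {driver_after} on device {device}"
        )
```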
2025-12-04T12:44:29.1382567Z E1204 12:43:58.916000 350143 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1382759Z E1204 12:43:58.916000 350143 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1383136Z E1204 12:43:58.916000 350143 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda 2025-12-04T12:44:29.1383249Z E1204 12:43:58.916000 350143 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1383468Z E1204 12:43:58.916000 350143 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1383631Z E1204 12:43:58.916000 350143 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:44:29.1383673Z FAILED [10.0235s] [100%] 2025-12-04T12:44:29.1383675Z 2025-12-04T12:44:29.1383735Z =================================== FAILURES =================================== 2025-12-04T12:44:29.1383843Z ____ TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda ____ 2025-12-04T12:44:29.1383895Z Traceback (most recent call last): 2025-12-04T12:44:29.1384059Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:44:29.1384118Z self._join_processes(fn) 2025-12-04T12:44:29.1384293Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:44:29.1384351Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:44:29.1384547Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:44:29.1384591Z raise RuntimeError(error) 2025-12-04T12:44:29.1384675Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:44:29.1384721Z Traceback (most recent call last): 2025-12-04T12:44:29.1384888Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1384931Z getattr(self, test_name)() 2025-12-04T12:44:29.1385096Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1385132Z fn() 2025-12-04T12:44:29.1385288Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1385330Z method(*args, **kwargs) 2025-12-04T12:44:29.1385487Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1385529Z method(*args, **kwargs) 2025-12-04T12:44:29.1385683Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1385722Z with policy(): 2025-12-04T12:44:29.1385878Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1385919Z raise RuntimeError(msg) 2025-12-04T12:44:29.1386272Z 
RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda! Caching allocator allocated memory was 0 and is now reported as 512 on device 2. CUDA driver allocated memory was 1268776960 and is now 3047161856. 2025-12-04T12:44:29.1386274Z 2025-12-04T12:44:29.1386351Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1386620Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda 2025-12-04T12:44:29.1386623Z 2025-12-04T12:44:29.1386714Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1386717Z 2025-12-04T12:44:29.1386718Z 2025-12-04T12:44:29.1386795Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:44:29.1386886Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:44:29.1387157Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-02c126275fdbc82a.xml - 2025-12-04T12:44:29.1387222Z =========================== short test summary info ============================ 2025-12-04T12:44:29.1387505Z FAILED [10.0235s] distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_root_module_is_not_FSDP_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:44:29.1387555Z Traceback (most recent call last): 2025-12-04T12:44:29.1387719Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1387766Z getattr(self, test_name)() 2025-12-04T12:44:29.1387927Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1387977Z fn() 2025-12-04T12:44:29.1388129Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1388173Z method(*args, **kwargs) 2025-12-04T12:44:29.1388329Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1388381Z method(*args, **kwargs) 2025-12-04T12:44:29.1388536Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1388575Z with policy(): 2025-12-04T12:44:29.1388729Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1388771Z raise RuntimeError(msg) 2025-12-04T12:44:29.1389120Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda! Caching allocator allocated memory was 0 and is now reported as 512 on device 2. CUDA driver allocated memory was 1268776960 and is now 3047161856. 
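[editor's note] The repro command printed in the message above can be run as-is from the repo root. For anyone who prefers to stay inside Python, the sketch below drives the same test through pytest (the node id is the one shown later in this log; the env vars are the ones the message itself names, and the pytest.main wrapper is only a convenience, not part of the official workflow).

```python
# Convenience wrapper around the printed repro command; run from the pytorch repo root.
import os
import pytest

os.environ["PYTORCH_TEST_WITH_ROCM"] = "1"
os.environ["PYTORCH_TEST_CUDA_MEM_LEAK_CHECK"] = "1"
# Set to "0" to suppress the repro banner, as the message notes.
os.environ.setdefault("PYTORCH_PRINT_REPRO_ON_FAILURE", "1")

exit_code = pytest.main([
    "test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::"
    "TestHSDPWithDeviceMeshAndDTensorCUDA::test_root_module_is_not_FSDP_cuda",
    "-x", "-v",
])
raise SystemExit(exit_code)
```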
2025-12-04T12:44:29.1389123Z 2025-12-04T12:44:29.1389198Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1389456Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda 2025-12-04T12:44:29.1389459Z 2025-12-04T12:44:29.1389547Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1389726Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:44:29.1389789Z ======================= 1 failed, 7 deselected in 10.03s ======================= 2025-12-04T12:44:29.1389831Z Got exit code 1 2025-12-04T12:44:29.1389872Z Retrying single test... 2025-12-04T12:44:29.1390099Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-28db2e4f1bd8cda0.xml 2025-12-04T12:44:29.1390162Z ============================= test session starts ============================== 2025-12-04T12:44:29.1390277Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:44:29.1390323Z cachedir: .pytest_cache 2025-12-04T12:44:29.1390498Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:44:29.1390551Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:44:29.1390593Z configfile: pytest.ini 2025-12-04T12:44:29.1390762Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:44:29.1390835Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T12:44:29.1391086Z stepcurrent: skipping 7 already run items. Running only test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_root_module_is_not_FSDP_cuda 2025-12-04T12:44:29.1391131Z Running 1 items in this shard 2025-12-04T12:44:29.1391133Z 2025-12-04T12:44:29.1391482Z distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_root_module_is_not_FSDP_cuda I1204 12:44:02.782000 350542 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 350611 2025-12-04T12:44:29.1391640Z I1204 12:44:02.783000 350542 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 350612 2025-12-04T12:44:29.1391797Z I1204 12:44:02.784000 350542 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 350613 2025-12-04T12:44:29.1391948Z I1204 12:44:02.784000 350542 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 350614 2025-12-04T12:44:29.1393030Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). 
If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:44:29.1393171Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.1394237Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:44:29.1394361Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.1395424Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:44:29.1395551Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.1396618Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 
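[editor's note] The UserWarning above is aimed at operator authors: an in-place op (here c10d::allreduce_) reached the autograd engine without an Autograd kernel registered. For a custom operator you own, the fix it suggests (an explicit Autograd fallthrough, the Python analogue of torch::CppFunction::makeFallthrough()) looks roughly like the sketch below. The "myops::noop_" operator is purely hypothetical and only stands in for your own op; this is not how c10d registers allreduce_.

```python
# Hypothetical example of silencing this class of warning for an operator you own
# by registering an explicit Autograd fallthrough. "myops" / "noop_" are made-up names.
import torch

lib = torch.library.Library("myops", "DEF")
lib.define("noop_(Tensor(a!) x) -> Tensor(a!)")

def noop_impl(x):
    return x  # trivial in-place-style op: returns its input unchanged

lib.impl("noop_", noop_impl, "CompositeExplicitAutograd")
# Declare "no autograd behaviour" explicitly instead of relying on the deprecated fallback:
lib.impl("noop_", torch.library.fallthrough_kernel, "Autograd")
```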
2025-12-04T12:44:29.1396743Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.1397457Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1397580Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:44:29.1398285Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1398379Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:44:29.1399087Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1399178Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:44:29.1399944Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1400034Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:44:29.1400752Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:829: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:44:29.1400815Z FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:44:29.1401527Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:829: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1401592Z FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:44:29.1402290Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:829: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1402367Z FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:44:29.1403068Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:829: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
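[editor's note] The FutureWarning repeated above points at the replacement API in torch.distributed.checkpoint.state_dict. A minimal sketch of the recommended direction is below; `model` and `optim` are placeholders for an already FSDP-wrapped module and its optimizer, and the options you pass will depend on what the deprecated set_state_dict_type call was configuring.

```python
# Sketch of the migration the FutureWarning recommends: instead of
# FSDP.set_state_dict_type(...), use the checkpoint state_dict helpers.
# `model` / `optim` are placeholders, not defined here.
from torch.distributed.checkpoint.state_dict import (
    StateDictOptions,
    get_state_dict,
    set_state_dict,
)

options = StateDictOptions(full_state_dict=False)  # sharded state dict; adjust as needed

# Save side: one call returns both model and optimizer state.
model_state, optim_state = get_state_dict(model, optim, options=options)

# Load side: feed the same pair back in.
set_state_dict(
    model,
    optim,
    model_state_dict=model_state,
    optim_state_dict=optim_state,
    options=options,
)
```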
2025-12-04T12:44:29.1403139Z FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:44:29.1403279Z E1204 12:44:11.619000 350613 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1403436Z E1204 12:44:11.619000 350613 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1403723Z E1204 12:44:11.619000 350613 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1403875Z E1204 12:44:11.619000 350613 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1404157Z E1204 12:44:11.619000 350613 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1404276Z E1204 12:44:11.619000 350613 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1404548Z E1204 12:44:11.619000 350613 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1404693Z E1204 12:44:11.619000 350613 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1404976Z E1204 12:44:11.619000 350613 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1405122Z E1204 12:44:11.619000 350613 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1405390Z E1204 12:44:11.619000 350613 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1405524Z E1204 12:44:11.619000 350613 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1405796Z E1204 12:44:11.619000 350613 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1405952Z E1204 12:44:11.619000 350613 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1406422Z E1204 12:44:11.619000 350613 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda! Caching allocator allocated memory was 0 and is now reported as 512 on device 2. CUDA driver allocated memory was 1258291200 and is now 3047161856. 
2025-12-04T12:44:29.1406532Z E1204 12:44:11.619000 350613 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1406736Z E1204 12:44:11.619000 350613 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1407114Z E1204 12:44:11.619000 350613 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda 2025-12-04T12:44:29.1407239Z E1204 12:44:11.619000 350613 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1407447Z E1204 12:44:11.619000 350613 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1407604Z E1204 12:44:11.619000 350613 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:44:29.1407740Z E1204 12:44:11.620000 350612 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1407897Z E1204 12:44:11.620000 350612 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1408185Z E1204 12:44:11.620000 350612 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1408334Z E1204 12:44:11.620000 350612 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1408616Z E1204 12:44:11.620000 350612 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1408732Z E1204 12:44:11.620000 350612 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1409008Z E1204 12:44:11.620000 350612 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1409153Z E1204 12:44:11.620000 350612 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1409434Z E1204 12:44:11.620000 350612 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1409623Z E1204 12:44:11.620000 350612 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1409894Z E1204 12:44:11.620000 350612 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1410027Z E1204 12:44:11.620000 350612 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1410312Z E1204 12:44:11.620000 350612 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1410457Z E1204 12:44:11.620000 350612 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1410925Z E1204 12:44:11.620000 350612 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda! Caching allocator allocated memory was 0 and is now reported as 512 on device 1. CUDA driver allocated memory was 1268776960 and is now 3047161856. 2025-12-04T12:44:29.1411051Z E1204 12:44:11.620000 350612 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1411243Z E1204 12:44:11.620000 350612 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1411619Z E1204 12:44:11.620000 350612 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda 2025-12-04T12:44:29.1411743Z E1204 12:44:11.620000 350612 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1411947Z E1204 12:44:11.620000 350612 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1412107Z E1204 12:44:11.620000 350612 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:44:29.1412239Z E1204 12:44:11.632000 350611 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1412397Z E1204 12:44:11.632000 350611 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1412680Z E1204 12:44:11.632000 350611 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1412827Z E1204 12:44:11.632000 350611 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1413107Z E1204 12:44:11.632000 350611 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1413223Z E1204 12:44:11.632000 350611 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1413496Z E1204 12:44:11.632000 350611 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1413652Z E1204 12:44:11.632000 350611 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1413923Z E1204 12:44:11.632000 350611 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1414067Z E1204 12:44:11.632000 350611 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1414335Z E1204 12:44:11.632000 350611 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1414467Z E1204 12:44:11.632000 350611 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1414759Z E1204 12:44:11.632000 350611 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1414904Z E1204 12:44:11.632000 350611 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1415373Z E1204 12:44:11.632000 350611 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda! Caching allocator allocated memory was 0 and is now reported as 512 on device 0. CUDA driver allocated memory was 1438646272 and is now 3206545408. 2025-12-04T12:44:29.1415496Z E1204 12:44:11.632000 350611 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1415689Z E1204 12:44:11.632000 350611 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1416075Z E1204 12:44:11.632000 350611 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda 2025-12-04T12:44:29.1416184Z E1204 12:44:11.632000 350611 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1416387Z E1204 12:44:11.632000 350611 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1416551Z E1204 12:44:11.632000 350611 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:44:29.1416681Z E1204 12:44:11.634000 350614 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1416836Z E1204 12:44:11.634000 350614 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1417115Z E1204 12:44:11.634000 350614 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1417263Z E1204 12:44:11.634000 350614 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1417544Z E1204 12:44:11.634000 350614 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1417659Z E1204 12:44:11.634000 350614 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1417931Z E1204 12:44:11.634000 350614 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1418085Z E1204 12:44:11.634000 350614 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1418357Z E1204 12:44:11.634000 350614 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1418498Z E1204 12:44:11.634000 350614 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1418769Z E1204 12:44:11.634000 350614 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1418912Z E1204 12:44:11.634000 350614 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1419183Z E1204 12:44:11.634000 350614 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1419325Z E1204 12:44:11.634000 350614 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1419827Z E1204 12:44:11.634000 350614 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda! Caching allocator allocated memory was 0 and is now reported as 512 on device 3. CUDA driver allocated memory was 1268776960 and is now 3047161856. 
2025-12-04T12:44:29.1419958Z E1204 12:44:11.634000 350614 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1420149Z E1204 12:44:11.634000 350614 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1420539Z E1204 12:44:11.634000 350614 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda 2025-12-04T12:44:29.1420650Z E1204 12:44:11.634000 350614 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1420854Z E1204 12:44:11.634000 350614 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1421015Z E1204 12:44:11.634000 350614 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:44:29.1421056Z FAILED [10.3199s] [100%] 2025-12-04T12:44:29.1421058Z 2025-12-04T12:44:29.1421121Z =================================== FAILURES =================================== 2025-12-04T12:44:29.1421232Z ____ TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda ____ 2025-12-04T12:44:29.1421285Z Traceback (most recent call last): 2025-12-04T12:44:29.1421451Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:44:29.1421500Z self._join_processes(fn) 2025-12-04T12:44:29.1421673Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:44:29.1421732Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:44:29.1421911Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:44:29.1421959Z raise RuntimeError(error) 2025-12-04T12:44:29.1422040Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:44:29.1422091Z Traceback (most recent call last): 2025-12-04T12:44:29.1422266Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1422313Z getattr(self, test_name)() 2025-12-04T12:44:29.1422475Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1422511Z fn() 2025-12-04T12:44:29.1422667Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1422709Z method(*args, **kwargs) 2025-12-04T12:44:29.1422867Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1422909Z method(*args, **kwargs) 2025-12-04T12:44:29.1423081Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1423121Z with policy(): 2025-12-04T12:44:29.1423280Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1423323Z raise RuntimeError(msg) 2025-12-04T12:44:29.1423679Z 
RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda! Caching allocator allocated memory was 0 and is now reported as 512 on device 0. CUDA driver allocated memory was 1438646272 and is now 3206545408. 2025-12-04T12:44:29.1423694Z 2025-12-04T12:44:29.1423770Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1424026Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda 2025-12-04T12:44:29.1424039Z 2025-12-04T12:44:29.1424128Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1424134Z 2025-12-04T12:44:29.1424196Z Process 2 exited with error code 10 and exception: 2025-12-04T12:44:29.1424246Z Traceback (most recent call last): 2025-12-04T12:44:29.1424409Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1424457Z getattr(self, test_name)() 2025-12-04T12:44:29.1424617Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1424657Z fn() 2025-12-04T12:44:29.1424809Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1424854Z method(*args, **kwargs) 2025-12-04T12:44:29.1425005Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1425050Z method(*args, **kwargs) 2025-12-04T12:44:29.1425202Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1425245Z with policy(): 2025-12-04T12:44:29.1425399Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1425443Z raise RuntimeError(msg) 2025-12-04T12:44:29.1425790Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda! Caching allocator allocated memory was 0 and is now reported as 512 on device 2. CUDA driver allocated memory was 1258291200 and is now 3047161856. 2025-12-04T12:44:29.1425794Z 2025-12-04T12:44:29.1425873Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1426140Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda 2025-12-04T12:44:29.1426148Z 2025-12-04T12:44:29.1426235Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1426237Z 2025-12-04T12:44:29.1426239Z 2025-12-04T12:44:29.1426319Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:44:29.1426406Z Process 0 terminated with exit code 10, terminating remaining processes. 
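[editor's note] The "Process 0 terminated with exit code 10, terminating remaining processes" line comes from the multi-process test harness: each rank runs the test body in its own process and the parent inspects the exit codes (10 being the code used when an in-process check fails). The sketch below shows that general spawn-and-check pattern with torch.multiprocessing; it is a simplification, not the MultiProcessTestCase code in common_distributed.py.

```python
# Simplified spawn-and-check pattern behind messages like
# "Process 0 terminated with exit code 10": one process per rank,
# then fail if any rank exited non-zero.
import torch.multiprocessing as mp

WORLD_SIZE = 4

def rank_main(rank):
    # ... per-rank test body would go here ...
    pass

def run_all_ranks():
    ctx = mp.get_context("spawn")
    procs = [ctx.Process(target=rank_main, args=(r,)) for r in range(WORLD_SIZE)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    for rank, p in enumerate(procs):
        if p.exitcode != 0:
            raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")

if __name__ == "__main__":
    run_all_ranks()
```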
2025-12-04T12:44:29.1426684Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-28db2e4f1bd8cda0.xml - 2025-12-04T12:44:29.1426747Z =========================== short test summary info ============================ 2025-12-04T12:44:29.1427031Z FAILED [10.3199s] distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_root_module_is_not_FSDP_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:44:29.1427080Z Traceback (most recent call last): 2025-12-04T12:44:29.1427248Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1427291Z getattr(self, test_name)() 2025-12-04T12:44:29.1427454Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1427490Z fn() 2025-12-04T12:44:29.1427644Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1427701Z method(*args, **kwargs) 2025-12-04T12:44:29.1427852Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1427894Z method(*args, **kwargs) 2025-12-04T12:44:29.1428056Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1428098Z with policy(): 2025-12-04T12:44:29.1428252Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1428297Z raise RuntimeError(msg) 2025-12-04T12:44:29.1428647Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda! Caching allocator allocated memory was 0 and is now reported as 512 on device 0. CUDA driver allocated memory was 1438646272 and is now 3206545408. 
2025-12-04T12:44:29.1428650Z 2025-12-04T12:44:29.1428729Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1428983Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda 2025-12-04T12:44:29.1428986Z 2025-12-04T12:44:29.1429078Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1429080Z 2025-12-04T12:44:29.1429140Z Process 2 exited with error code 10 and exception: 2025-12-04T12:44:29.1429189Z Traceback (most recent call last): 2025-12-04T12:44:29.1429353Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1429398Z getattr(self, test_name)() 2025-12-04T12:44:29.1429559Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1429723Z fn() 2025-12-04T12:44:29.1429878Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1429918Z method(*args, **kwargs) 2025-12-04T12:44:29.1430074Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1430141Z method(*args, **kwargs) 2025-12-04T12:44:29.1430296Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1430333Z with policy(): 2025-12-04T12:44:29.1430488Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1430530Z raise RuntimeError(msg) 2025-12-04T12:44:29.1430879Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda! Caching allocator allocated memory was 0 and is now reported as 512 on device 2. CUDA driver allocated memory was 1258291200 and is now 3047161856. 2025-12-04T12:44:29.1430882Z 2025-12-04T12:44:29.1430956Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1431225Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda 2025-12-04T12:44:29.1431227Z 2025-12-04T12:44:29.1431319Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1431384Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:44:29.1431451Z ======================= 1 failed, 7 deselected in 10.33s ======================= 2025-12-04T12:44:29.1431489Z Got exit code 1 2025-12-04T12:44:29.1431533Z Retrying single test... 
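[editor's note] "Got exit code 1" followed by "Retrying single test..." is the CI test runner re-invoking pytest on just the failing node id, with each attempt writing its own JUnit XML report (note the new hash in the report filename below). A rough sketch of that retry loop follows; the flag set and report naming are illustrative, not run_test.py's exact arguments.

```python
# Rough sketch of the "retry a single failing test" loop visible in this log:
# re-run one node id a few times, each attempt with its own XML report.
import subprocess
import sys
import uuid

NODE_ID = (
    "test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::"
    "TestHSDPWithDeviceMeshAndDTensorCUDA::test_root_module_is_not_FSDP_cuda"
)
MAX_RETRIES = 3

for attempt in range(MAX_RETRIES):
    report = f"test-reports/python-pytest/retry-{uuid.uuid4().hex}.xml"
    code = subprocess.call(
        [sys.executable, "-m", "pytest", NODE_ID, "-x", "-v", f"--junit-xml={report}"]
    )
    print(f"Got exit code {code}")
    if code == 0:
        break
    print("Retrying single test...")
else:
    sys.exit(code)
```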
2025-12-04T12:44:29.1431776Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-8c7b7a26ccdec75d.xml 2025-12-04T12:44:29.1431839Z ============================= test session starts ============================== 2025-12-04T12:44:29.1431955Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:44:29.1432013Z cachedir: .pytest_cache 2025-12-04T12:44:29.1432171Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:44:29.1432223Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:44:29.1432266Z configfile: pytest.ini 2025-12-04T12:44:29.1432434Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:44:29.1432507Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T12:44:29.1432757Z stepcurrent: skipping 7 already run items. Running only test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_root_module_is_not_FSDP_cuda 2025-12-04T12:44:29.1432802Z Running 1 items in this shard 2025-12-04T12:44:29.1432804Z 2025-12-04T12:44:29.1433137Z distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_root_module_is_not_FSDP_cuda I1204 12:44:15.600000 351012 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 351081 2025-12-04T12:44:29.1433294Z I1204 12:44:15.601000 351012 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 351082 2025-12-04T12:44:29.1433450Z I1204 12:44:15.601000 351012 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 351083 2025-12-04T12:44:29.1433603Z I1204 12:44:15.602000 351012 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 351084 2025-12-04T12:44:29.1434694Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:44:29.1434823Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.1435898Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. 
DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:44:29.1436022Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.1437084Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:44:29.1437230Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.1438292Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:44:29.1438413Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:44:29.1439128Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1439234Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:44:29.1439983Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1440080Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:44:29.1440788Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1440883Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:44:29.1441580Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1441697Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:44:29.1442400Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:829: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1442467Z FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:44:29.1443162Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:829: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1443226Z FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:44:29.1443919Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:829: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
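The FutureWarning above recommends migrating from FSDP.set_state_dict_type() to the parallelism-agnostic get_state_dict()/set_state_dict() APIs in torch.distributed.checkpoint.state_dict. Below is a minimal single-process sketch of that migration; the Linear model and SGD optimizer are placeholders, and on a real FSDP run the same two calls are made on the wrapped module (see the API doc and tutorial linked in the warning).

import torch
from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict

# Placeholder model/optimizer; in the tests above this would be the FSDP-wrapped module.
model = torch.nn.Linear(4, 4)
optim = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
model(torch.randn(2, 4)).sum().backward()
optim.step()  # populate optimizer state before snapshotting it

# Extract model and optimizer state dicts in one call, independent of FSDP1/FSDP2/DDP.
model_state_dict, optim_state_dict = get_state_dict(model, optimizers=optim)

# ... persist the two dicts, e.g. with torch.distributed.checkpoint ...

# Restore them onto the (possibly wrapped) model and optimizer.
set_state_dict(
    model,
    optimizers=optim,
    model_state_dict=model_state_dict,
    optim_state_dict=optim_state_dict,
)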
2025-12-04T12:44:29.1443982Z FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:44:29.1444697Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:829: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:44:29.1444756Z FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:44:29.1444894Z E1204 12:44:24.432000 351083 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1445049Z E1204 12:44:24.432000 351083 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1445345Z E1204 12:44:24.432000 351083 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1445494Z E1204 12:44:24.432000 351083 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1445779Z E1204 12:44:24.432000 351083 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1445898Z E1204 12:44:24.432000 351083 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1446180Z E1204 12:44:24.432000 351083 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1446325Z E1204 12:44:24.432000 351083 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1446611Z E1204 12:44:24.432000 351083 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1446753Z E1204 12:44:24.432000 351083 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1447020Z E1204 12:44:24.432000 351083 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1447151Z E1204 12:44:24.432000 351083 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1447425Z E1204 12:44:24.432000 351083 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1447567Z E1204 12:44:24.432000 351083 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1448043Z E1204 12:44:24.432000 351083 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda! Caching allocator allocated memory was 0 and is now reported as 512 on device 2. CUDA driver allocated memory was 1268776960 and is now 3047161856. 2025-12-04T12:44:29.1448155Z E1204 12:44:24.432000 351083 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1448348Z E1204 12:44:24.432000 351083 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1448735Z E1204 12:44:24.432000 351083 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda 2025-12-04T12:44:29.1448845Z E1204 12:44:24.432000 351083 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1449053Z E1204 12:44:24.432000 351083 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1449211Z E1204 12:44:24.432000 351083 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:44:29.1449344Z E1204 12:44:24.465000 351081 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1449497Z E1204 12:44:24.465000 351081 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1449831Z E1204 12:44:24.465000 351081 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1449978Z E1204 12:44:24.465000 351081 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1450257Z E1204 12:44:24.465000 351081 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1450389Z E1204 12:44:24.465000 351081 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1450659Z E1204 12:44:24.465000 351081 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1450818Z E1204 12:44:24.465000 351081 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1451088Z E1204 12:44:24.465000 351081 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1451230Z E1204 12:44:24.465000 351081 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1451498Z E1204 12:44:24.465000 351081 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1451630Z E1204 12:44:24.465000 351081 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1451902Z E1204 12:44:24.465000 351081 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1452045Z E1204 12:44:24.465000 351081 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1452521Z E1204 12:44:24.465000 351081 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda! Caching allocator allocated memory was 0 and is now reported as 512 on device 0. CUDA driver allocated memory was 1438646272 and is now 3210739712. 2025-12-04T12:44:29.1452629Z E1204 12:44:24.465000 351081 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1452820Z E1204 12:44:24.465000 351081 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1453207Z E1204 12:44:24.465000 351081 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda 2025-12-04T12:44:29.1453319Z E1204 12:44:24.465000 351081 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1453523Z E1204 12:44:24.465000 351081 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1453685Z E1204 12:44:24.465000 351081 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:44:29.1453815Z E1204 12:44:24.511000 351084 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1453977Z E1204 12:44:24.511000 351084 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1454258Z E1204 12:44:24.511000 351084 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1454404Z E1204 12:44:24.511000 351084 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1454683Z E1204 12:44:24.511000 351084 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1454808Z E1204 12:44:24.511000 351084 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1455079Z E1204 12:44:24.511000 351084 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1455230Z E1204 12:44:24.511000 351084 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1455498Z E1204 12:44:24.511000 351084 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1455639Z E1204 12:44:24.511000 351084 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1455908Z E1204 12:44:24.511000 351084 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1456039Z E1204 12:44:24.511000 351084 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1456314Z E1204 12:44:24.511000 351084 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1456458Z E1204 12:44:24.511000 351084 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1456920Z E1204 12:44:24.511000 351084 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda! Caching allocator allocated memory was 0 and is now reported as 512 on device 3. CUDA driver allocated memory was 1260388352 and is now 3047161856. 2025-12-04T12:44:29.1457030Z E1204 12:44:24.511000 351084 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1457222Z E1204 12:44:24.511000 351084 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1457606Z E1204 12:44:24.511000 351084 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda 2025-12-04T12:44:29.1457715Z E1204 12:44:24.511000 351084 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1457917Z E1204 12:44:24.511000 351084 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1458077Z E1204 12:44:24.511000 351084 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:44:29.1458218Z E1204 12:44:24.512000 351082 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:44:29.1458376Z E1204 12:44:24.512000 351082 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:44:29.1458660Z E1204 12:44:24.512000 351082 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1458806Z E1204 12:44:24.512000 351082 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:44:29.1459084Z E1204 12:44:24.512000 351082 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 772, in wrapper 2025-12-04T12:44:29.1459212Z E1204 12:44:24.512000 351082 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:44:29.1459485Z E1204 12:44:24.512000 351082 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1459679Z E1204 12:44:24.512000 351082 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1459949Z E1204 12:44:24.512000 351082 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1460090Z E1204 12:44:24.512000 351082 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:44:29.1460360Z E1204 12:44:24.512000 351082 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1460491Z E1204 12:44:24.512000 351082 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:44:29.1460764Z E1204 12:44:24.512000 351082 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1460906Z E1204 12:44:24.512000 351082 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:44:29.1461373Z E1204 12:44:24.512000 351082 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda! Caching allocator allocated memory was 0 and is now reported as 512 on device 1. CUDA driver allocated memory was 1268776960 and is now 3047161856. 
2025-12-04T12:44:29.1461485Z E1204 12:44:24.512000 351082 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1461697Z E1204 12:44:24.512000 351082 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1462068Z E1204 12:44:24.512000 351082 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda 2025-12-04T12:44:29.1462178Z E1204 12:44:24.512000 351082 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:44:29.1462382Z E1204 12:44:24.512000 351082 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1462540Z E1204 12:44:24.512000 351082 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:44:29.1462596Z FAILED [10.1208s] [100%] 2025-12-04T12:44:29.1462598Z 2025-12-04T12:44:29.1462660Z =================================== FAILURES =================================== 2025-12-04T12:44:29.1462768Z ____ TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda ____ 2025-12-04T12:44:29.1462818Z Traceback (most recent call last): 2025-12-04T12:44:29.1462981Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:44:29.1463028Z self._join_processes(fn) 2025-12-04T12:44:29.1463201Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:44:29.1463276Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:44:29.1463460Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:44:29.1463505Z raise RuntimeError(error) 2025-12-04T12:44:29.1463599Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:44:29.1463645Z Traceback (most recent call last): 2025-12-04T12:44:29.1463815Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1463857Z getattr(self, test_name)() 2025-12-04T12:44:29.1464021Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1464057Z fn() 2025-12-04T12:44:29.1464211Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1464252Z method(*args, **kwargs) 2025-12-04T12:44:29.1464406Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1464447Z method(*args, **kwargs) 2025-12-04T12:44:29.1464602Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1464640Z with policy(): 2025-12-04T12:44:29.1464795Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1464836Z raise RuntimeError(msg) 2025-12-04T12:44:29.1465185Z 
RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda! Caching allocator allocated memory was 0 and is now reported as 512 on device 2. CUDA driver allocated memory was 1268776960 and is now 3047161856. 2025-12-04T12:44:29.1465188Z 2025-12-04T12:44:29.1465262Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1465519Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda 2025-12-04T12:44:29.1465522Z 2025-12-04T12:44:29.1465625Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1465627Z 2025-12-04T12:44:29.1465629Z 2025-12-04T12:44:29.1465705Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:44:29.1465794Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:44:29.1466065Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-8c7b7a26ccdec75d.xml - 2025-12-04T12:44:29.1466131Z =========================== short test summary info ============================ 2025-12-04T12:44:29.1466400Z FAILED [10.1208s] distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_root_module_is_not_FSDP_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:44:29.1466461Z Traceback (most recent call last): 2025-12-04T12:44:29.1466628Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:44:29.1466674Z getattr(self, test_name)() 2025-12-04T12:44:29.1466834Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:44:29.1466872Z fn() 2025-12-04T12:44:29.1467023Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1467076Z method(*args, **kwargs) 2025-12-04T12:44:29.1467230Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:44:29.1467269Z method(*args, **kwargs) 2025-12-04T12:44:29.1467423Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:44:29.1467472Z with policy(): 2025-12-04T12:44:29.1467628Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:44:29.1467669Z raise RuntimeError(msg) 2025-12-04T12:44:29.1468017Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda! Caching allocator allocated memory was 0 and is now reported as 512 on device 2. CUDA driver allocated memory was 1268776960 and is now 3047161856. 
2025-12-04T12:44:29.1468019Z 2025-12-04T12:44:29.1468094Z To execute this test, run the following from the base repo dir: 2025-12-04T12:44:29.1468349Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_hsdp_dtensor_state_dict.py TestHSDPWithDeviceMeshAndDTensorCUDA.test_root_module_is_not_FSDP_cuda 2025-12-04T12:44:29.1468351Z 2025-12-04T12:44:29.1468440Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:44:29.1468508Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:44:29.1468570Z ======================= 1 failed, 7 deselected in 10.13s ======================= 2025-12-04T12:44:29.1468609Z Got exit code 1 2025-12-04T12:44:29.1468816Z FAILED CONSISTENTLY: test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_root_module_is_not_FSDP_cuda 2025-12-04T12:44:29.1468946Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:44:29.1469174Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-6462e47ce8fb2aa4.xml 2025-12-04T12:44:29.1469232Z ============================= test session starts ============================== 2025-12-04T12:44:29.1469349Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T12:44:29.1469391Z cachedir: .pytest_cache 2025-12-04T12:44:29.1469564Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:44:29.1469650Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:44:29.1469694Z configfile: pytest.ini 2025-12-04T12:44:29.1469856Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:44:29.1469930Z collecting ... collected 8 items / 8 deselected / 0 selected 2025-12-04T12:44:29.1469983Z stepcurrent: skipping 8 already run items. 
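The leak failure above is produced by the check enabled with PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1: per-device allocation counters are snapshotted before the test body and compared afterwards, and the error reports both the caching-allocator and the driver-level numbers. The context manager below is a minimal sketch of that before/after comparison using only public torch.cuda APIs; it is not the internal implementation in torch.testing._internal.common_utils, and the helper name is made up.

import contextlib
import torch

@contextlib.contextmanager
def assert_no_cuda_leak(device: int):
    # Snapshot caching-allocator and driver-level usage before the block.
    torch.cuda.synchronize(device)
    caching_before = torch.cuda.memory_allocated(device)
    free_before, total = torch.cuda.mem_get_info(device)
    driver_before = total - free_before
    try:
        yield
    finally:
        torch.cuda.synchronize(device)
        caching_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        driver_after = total - free_after
        # Fail on caching-allocator growth, quoting the driver numbers for context,
        # in the same format as the RuntimeError messages above.
        if caching_after > caching_before:
            raise RuntimeError(
                f"Caching allocator allocated memory was {caching_before} and is now "
                f"reported as {caching_after} on device {device}. Driver allocated "
                f"memory was {driver_before} and is now {driver_after}."
            )

if torch.cuda.is_available():
    with assert_no_cuda_leak(0):
        x = torch.ones(128, device="cuda:0")
        del x  # everything allocated inside the block is released again

In the failures above the caching-allocator delta is only 512 bytes per device, while the driver-level growth is roughly 1.8 GB, which is why the message quotes both views.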
2025-12-04T12:44:29.1470029Z Running 0 items in this shard 2025-12-04T12:44:29.1470031Z 2025-12-04T12:44:29.1470300Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_hsdp_dtensor_state_dict/distributed.fsdp.test_hsdp_dtensor_state_dict-6462e47ce8fb2aa4.xml - 2025-12-04T12:44:29.1470381Z ============================ 8 deselected in 0.00s ============================= 2025-12-04T12:44:29.1472150Z The following tests failed consistently: ['test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_cuda', 'test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_cuda', 'test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_cuda', 'test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_cuda', 'test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_cuda', 'test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_cuda', 'test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_hsdp_init_with_device_mesh_cuda', 'test/distributed/fsdp/test_hsdp_dtensor_state_dict.py::TestHSDPWithDeviceMeshAndDTensorCUDA::test_root_module_is_not_FSDP_cuda'] 2025-12-04T12:44:29.1472180Z 2025-12-04T12:44:29.1472404Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_hsdp_dtensor_state_dict 1/1 (test/test-reports/distributed.fsdp.test_hsdp_dtensor_state_dict_1.1_60de516b7e1e2204_.log) 2025-12-04T12:44:29.1472407Z 2025-12-04T12:44:29.1472549Z Finished distributed/fsdp/test_hsdp_dtensor_state_dict 1/1 ... [2025-12-04 12:44:29.028174][5230310.007213409], took 5.24min 2025-12-04T12:44:29.1472820Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T12:44:29.1472908Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:44:29.1473007Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T12:44:29.1473055Z Uploading artifacts took 0.00 seconds 2025-12-04T12:44:29.1473129Z distributed/fsdp/test_hsdp_dtensor_state_dict 1/1 failed! 2025-12-04T12:44:29.1473247Z Running distributed/fsdp/test_fsdp_hybrid_shard 1/1 ... [2025-12-04 12:44:29.031021][5230310.010063092] 2025-12-04T12:44:29.1473299Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:44:29.1473625Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_hybrid_shard.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:44:29.031206] 2025-12-04T12:45:30.0435109Z 2025-12-04T12:45:30.0436233Z distributed/fsdp/test_fsdp_hybrid_shard 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_hybrid_shard_1.1_e89f2503325d3e91_.log 2025-12-04T12:45:30.0438965Z Running 6 items in this shard: test/distributed/fsdp/test_fsdp_hybrid_shard.py::TestFSDPHybridShard::test_fsdp_hybrid_shard_basic_setup, test/distributed/fsdp/test_fsdp_hybrid_shard.py::TestFSDPHybridShard::test_fsdp_hybrid_shard_parity, test/distributed/fsdp/test_fsdp_hybrid_shard.py::TestFSDPHybridShard::test_hsdp_save_load_state_dict, test/distributed/fsdp/test_fsdp_hybrid_shard.py::TestFSDPHybridShard::test_hsdp_sync_module_state, test/distributed/fsdp/test_fsdp_hybrid_shard.py::TestFSDPHybridShard::test_invalid_pg_specification_raises, test/distributed/fsdp/test_fsdp_hybrid_shard.py::TestFSDPHybridShard::test_raises_manual_wrap_hybrid_shard_when_none_policy 2025-12-04T12:45:30.0440743Z 2025-12-04T12:45:30.0440972Z Finished distributed/fsdp/test_fsdp_hybrid_shard 1/1 ... [2025-12-04 12:45:30.043272][5230371.02231184], took 1.02min 2025-12-04T12:45:30.0449716Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T12:45:30.0459107Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:45:30.0461633Z Running distributed/_composable/fsdp/test_fully_shard_training 1/1 ... [2025-12-04 12:45:30.046068][5230371.025108924] 2025-12-04T12:45:30.0461962Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:45:30.0463664Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/_composable/fsdp/test_fully_shard_training.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:45:30.046260] 2025-12-04T12:54:16.2445970Z 2025-12-04T12:54:16.2447190Z distributed/_composable/fsdp/test_fully_shard_training 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.fsdp.test_fully_shard_training_1.1_30a13ba1cb8fc7b7_.log 2025-12-04T12:54:16.2497292Z Running 25 items in this shard: test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShardForwardInputs::test_root_move_forward_input_to_device, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShardRegisteredParams::test_param_registration_after_backward, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShardRegisteredParams::test_param_registration_after_forward, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShardCastAfterInit::test_to_float64_after_init, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShard1DTrainingCore::test_explicit_prefetching, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShard1DTrainingCore::test_multi_forward_module, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShard1DTrainingCore::test_non_root_forward_backward, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShard1DTrainingCore::test_post_optim_event, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShard1DTrainingCore::test_train_parity_multi_group, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShard1DTrainingCore::test_train_parity_multi_group_cpu_offload_eager, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShard1DTrainingCore::test_train_parity_multi_group_unshard_async_op, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShard1DTrainingCore::test_train_parity_single_group_shard_dim0, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShard1DTrainingCore::test_train_parity_single_group_shard_largest_dim, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShard1DTrainingCompose::test_train_parity_with_activation_checkpointing, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShardShardPlacementFnMultiProcess::test_train_parity_shard_placement_fn_shard_largest_dim, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShardShardPlacementFnMultiThread::test_shard_placement_fn_contiguous_params_grads, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShardSharedParams::test_train_parity_with_shared_params, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShardGradientAccumulation::test_1f1b_microbatching, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShardGradientAccumulation::test_gradient_accumulation, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShardNDTraining::test_2d_mlp_with_nd_mesh, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShardHSDP3DTraining::test_3d_mlp_with_nd_mesh, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShardHSDPTraining::test_train_parity_hsdp, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShardCustomForwardMethod::test_register_fsdp_forward_method, test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShardShareCommContext::test_share_comm_context, 
test/distributed/_composable/fsdp/test_fully_shard_training.py::TestFullyShardWorldSize1::test_train_parity_single_worldsize1 2025-12-04T12:54:16.2719746Z 2025-12-04T12:54:16.2721763Z Finished distributed/_composable/fsdp/test_fully_shard_training 1/1 ... [2025-12-04 12:54:16.271871][5230897.250900075], took 8.77min 2025-12-04T12:54:16.2736913Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T12:54:16.2747282Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:54:16.2749159Z Running distributed/fsdp/test_fsdp_multiple_forward 1/1 ... [2025-12-04 12:54:16.274822][5230897.25386105] 2025-12-04T12:54:16.2749622Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:54:16.2752210Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_multiple_forward.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:54:16.275081] 2025-12-04T12:54:18.3477587Z 2025-12-04T12:54:18.3478650Z distributed/fsdp/test_fsdp_multiple_forward 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_multiple_forward_1.1_acc0220e24e9c56a_.log 2025-12-04T12:54:18.3479336Z Running 0 items in this shard: 2025-12-04T12:54:18.3479465Z 2025-12-04T12:54:18.3479782Z Finished distributed/fsdp/test_fsdp_multiple_forward 1/1 ... [2025-12-04 12:54:18.347491][5230899.326526629], took 0.03min 2025-12-04T12:54:18.3500576Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T12:54:18.3508989Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:54:18.3511419Z Running distributed/checkpoint/test_state_dict 1/1 ... [2025-12-04 12:54:18.351039][5230899.330080367] 2025-12-04T12:54:18.3511782Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:54:18.3513878Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/checkpoint/test_state_dict.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:54:18.351262] 2025-12-04T12:56:53.7036885Z 2025-12-04T12:56:53.7037563Z distributed/checkpoint/test_state_dict 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_state_dict_1.1_ac0a269fa24e4fe1_.log 2025-12-04T12:56:53.7046613Z Running 25 items in this shard: test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_activation_ckpt_fqns_ddp, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_activation_ckpt_fqns_fsdp1, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_broadcast_from_rank0, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_compiled_fsdp, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_cpu_offload_full_state_dict, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_ddp, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_deprecate_api, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_extra_state, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_flattened_osd, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_fsdp, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_fsdp2, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_fsdp_ddp, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_fsdp_root_not_initialized, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_multi_device_load_model_state_dict, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_multi_param_groups, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_non_persistent_buffers, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_optim_state_dict_param_matching, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_set_cpu_model_state_dict_broadcast_from_rank0, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_setting_meta_device_model, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_setting_meta_device_model_broadcasting_and_memory, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_shared_weight, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_single_gpu, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_state_dict_with_hook_on_keys, test/distributed/checkpoint/test_state_dict.py::TestStateDict::test_strict, test/distributed/checkpoint/test_state_dict.py::TestNoComm::test_no_dist 2025-12-04T12:56:53.7052294Z 2025-12-04T12:56:53.7052515Z Finished distributed/checkpoint/test_state_dict 1/1 ... [2025-12-04 12:56:53.705001][5231054.684038806], took 2.59min 2025-12-04T12:56:53.7069530Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T12:56:53.7078101Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:56:53.7080554Z Running distributed/fsdp/test_fsdp_core 1/2 ... [2025-12-04 12:56:53.707951][5231054.686992511] 2025-12-04T12:56:53.7080928Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:56:53.7083049Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/fsdp/test_fsdp_core.py', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:56:53.708170] 2025-12-04T13:38:31.9224615Z 2025-12-04T13:38:31.9227424Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_core 1/2 (test/test-reports/distributed.fsdp.test_fsdp_core_1.2_d5d5bc8f8345486d_.log) 2025-12-04T13:38:31.9227931Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-bfe1494716c9ba3f.xml 2025-12-04T13:38:31.9228249Z ============================= test session starts ============================== 2025-12-04T13:38:31.9228505Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:31.9228702Z cachedir: .pytest_cache 2025-12-04T13:38:31.9228929Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:31.9229181Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:31.9229313Z configfile: pytest.ini 2025-12-04T13:38:31.9229962Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:31.9230217Z collecting ... collected 60 items 2025-12-04T13:38:31.9230361Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T13:38:31.9236115Z Running 33 items in this shard: test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda, test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_False_mixed_precision_False_cuda, test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_False_mixed_precision_True_cuda, test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_True_mixed_precision_False_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_no_shard_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_no_shard_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_no_shard_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_no_shard_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda, 
test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_no_shard_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_no_shard_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:38:31.9242080Z 2025-12-04T13:38:31.9242396Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda I1204 12:56:55.485000 377415 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 377484 2025-12-04T13:38:31.9242908Z I1204 12:56:55.486000 377415 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 377485 2025-12-04T13:38:31.9243338Z I1204 12:56:55.486000 377415 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 377486 2025-12-04T13:38:31.9243682Z I1204 12:56:55.487000 377415 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 377487 2025-12-04T13:38:31.9244241Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9244690Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9245275Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:31.9245907Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9246359Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9247096Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9247532Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9247972Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9248567Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9249231Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9249826Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9250337Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9250914Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9251660Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9252241Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
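The device_id UserWarnings above list two fixes: call torch.cuda.set_device() per rank before constructing FSDP, or pass an explicit device index as the device_id argument. The sketch below shows both, assuming a torchrun launch with one process per GPU; the Linear module, tensor shapes, and file name are placeholders.

# Launch with: torchrun --nproc_per_node=<num_gpus> fsdp_device_id_sketch.py
import os
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def main():
    dist.init_process_group("nccl")  # maps to RCCL on ROCm
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)           # fix 1: make the current device explicit
    module = torch.nn.Linear(8, 8)
    model = FSDP(module, device_id=local_rank)  # fix 2: pass the index, not a bare "cuda"
    model(torch.randn(2, 8, device="cuda")).sum().backward()
    dist.destroy_process_group()

if __name__ == "__main__":
    main()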
2025-12-04T13:38:31.9252826Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9253155Z [rank2]:E1204 12:57:02.822000 377486 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9253496Z [rank2]:E1204 12:57:02.822000 377486 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9254008Z [rank2]:E1204 12:57:02.822000 377486 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9254489Z [rank2]:E1204 12:57:02.822000 377486 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9254973Z [rank2]:E1204 12:57:02.822000 377486 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9255440Z [rank2]:E1204 12:57:02.822000 377486 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9255923Z [rank2]:E1204 12:57:02.822000 377486 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9256470Z [rank2]:E1204 12:57:02.822000 377486 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9256933Z [rank2]:E1204 12:57:02.822000 377486 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9257401Z [rank2]:E1204 12:57:02.822000 377486 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9257867Z [rank2]:E1204 12:57:02.822000 377486 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9258318Z [rank2]:E1204 12:57:02.822000 377486 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9258774Z [rank2]:E1204 12:57:02.822000 377486 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9259240Z [rank2]:E1204 12:57:02.822000 377486 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9259978Z [rank2]:E1204 12:57:02.822000 377486 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3380609024. 
2025-12-04T13:38:31.9260609Z [rank2]:E1204 12:57:02.822000 377486 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9260960Z [rank2]:E1204 12:57:02.822000 377486 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9261579Z [rank2]:E1204 12:57:02.822000 377486 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:38:31.9262080Z [rank2]:E1204 12:57:02.822000 377486 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9262446Z [rank2]:E1204 12:57:02.822000 377486 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9262863Z [rank2]:E1204 12:57:02.822000 377486 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:31.9263106Z dist init r=2, world=4 2025-12-04T13:38:31.9263325Z [rank3]:E1204 12:57:02.827000 377487 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9263668Z [rank3]:E1204 12:57:02.827000 377487 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9264168Z [rank3]:E1204 12:57:02.827000 377487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9264665Z [rank3]:E1204 12:57:02.827000 377487 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9265168Z [rank3]:E1204 12:57:02.827000 377487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9265660Z [rank3]:E1204 12:57:02.827000 377487 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9266140Z [rank3]:E1204 12:57:02.827000 377487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9266605Z [rank3]:E1204 12:57:02.827000 377487 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9267072Z [rank3]:E1204 12:57:02.827000 377487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9267541Z [rank3]:E1204 12:57:02.827000 377487 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9268013Z [rank3]:E1204 12:57:02.827000 377487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9268515Z [rank3]:E1204 12:57:02.827000 377487 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:31.9269017Z [rank3]:E1204 12:57:02.827000 377487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9269516Z [rank3]:E1204 12:57:02.827000 377487 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9270228Z [rank3]:E1204 12:57:02.827000 377487 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3330277376. 2025-12-04T13:38:31.9270988Z [rank3]:E1204 12:57:02.827000 377487 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9271354Z [rank3]:E1204 12:57:02.827000 377487 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9272019Z [rank3]:E1204 12:57:02.827000 377487 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:38:31.9272523Z [rank3]:E1204 12:57:02.827000 377487 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9272908Z [rank3]:E1204 12:57:02.827000 377487 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9273357Z [rank3]:E1204 12:57:02.827000 377487 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:31.9273633Z dist init r=3, world=4 2025-12-04T13:38:31.9273839Z [rank1]:E1204 12:57:02.828000 377485 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9274190Z [rank1]:E1204 12:57:02.828000 377485 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9274718Z [rank1]:E1204 12:57:02.828000 377485 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9275231Z [rank1]:E1204 12:57:02.828000 377485 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9275755Z [rank1]:E1204 12:57:02.828000 377485 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9276200Z [rank1]:E1204 12:57:02.828000 377485 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9276642Z [rank1]:E1204 12:57:02.828000 377485 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9277108Z [rank1]:E1204 12:57:02.828000 377485 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:38:31.9277571Z [rank1]:E1204 12:57:02.828000 377485 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9278057Z [rank1]:E1204 12:57:02.828000 377485 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9278531Z [rank1]:E1204 12:57:02.828000 377485 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9278982Z [rank1]:E1204 12:57:02.828000 377485 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9279442Z [rank1]:E1204 12:57:02.828000 377485 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9279966Z [rank1]:E1204 12:57:02.828000 377485 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9280686Z [rank1]:E1204 12:57:02.828000 377485 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3397386240. 2025-12-04T13:38:31.9281329Z [rank1]:E1204 12:57:02.828000 377485 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9281675Z [rank1]:E1204 12:57:02.828000 377485 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9282281Z [rank1]:E1204 12:57:02.828000 377485 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:38:31.9282788Z [rank1]:E1204 12:57:02.828000 377485 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9283159Z [rank1]:E1204 12:57:02.828000 377485 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9283569Z [rank1]:E1204 12:57:02.828000 377485 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:31.9283808Z dist init r=1, world=4 2025-12-04T13:38:31.9284007Z [rank0]:E1204 12:57:02.880000 377484 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9284365Z [rank0]:E1204 12:57:02.880000 377484 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9284855Z [rank0]:E1204 12:57:02.880000 377484 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9285347Z [rank0]:E1204 12:57:02.880000 377484 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:38:31.9285826Z [rank0]:E1204 12:57:02.880000 377484 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9286277Z [rank0]:E1204 12:57:02.880000 377484 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9286717Z [rank0]:E1204 12:57:02.880000 377484 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9287181Z [rank0]:E1204 12:57:02.880000 377484 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9287643Z [rank0]:E1204 12:57:02.880000 377484 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9288156Z [rank0]:E1204 12:57:02.880000 377484 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9288657Z [rank0]:E1204 12:57:02.880000 377484 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9289109Z [rank0]:E1204 12:57:02.880000 377484 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9289664Z [rank0]:E1204 12:57:02.880000 377484 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9290131Z [rank0]:E1204 12:57:02.880000 377484 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9290791Z [rank0]:E1204 12:57:02.880000 377484 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 
2025-12-04T13:38:31.9291410Z [rank0]:E1204 12:57:02.880000 377484 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9291798Z [rank0]:E1204 12:57:02.880000 377484 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9292381Z [rank0]:E1204 12:57:02.880000 377484 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:38:31.9292878Z [rank0]:E1204 12:57:02.880000 377484 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9293245Z [rank0]:E1204 12:57:02.880000 377484 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9293690Z [rank0]:E1204 12:57:02.880000 377484 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:31.9293929Z dist init r=0, world=4 2025-12-04T13:38:31.9294347Z [rank0]:[W1204 12:57:03.136665255 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:31.9295003Z FAILED [9.3193s] [ 3%] 2025-12-04T13:38:31.9295079Z 2025-12-04T13:38:31.9295162Z =================================== FAILURES =================================== 2025-12-04T13:38:31.9295415Z ____ TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda ____ 2025-12-04T13:38:31.9295609Z Traceback (most recent call last): 2025-12-04T13:38:31.9295887Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:31.9296145Z self._join_processes(fn) 2025-12-04T13:38:31.9296407Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:31.9296677Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:31.9296956Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:31.9297224Z raise RuntimeError(error) 2025-12-04T13:38:31.9297386Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:31.9297555Z Traceback (most recent call last): 2025-12-04T13:38:31.9297804Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9298064Z getattr(self, test_name)() 2025-12-04T13:38:31.9298309Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9298552Z fn() 2025-12-04T13:38:31.9298783Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9299245Z method(*args, **kwargs) 2025-12-04T13:38:31.9299666Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9336557Z method(*args, **kwargs) 2025-12-04T13:38:31.9336865Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9337105Z with policy(): 2025-12-04T13:38:31.9337333Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9337578Z raise RuntimeError(msg) 2025-12-04T13:38:31.9338005Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3397386240. 2025-12-04T13:38:31.9338447Z 2025-12-04T13:38:31.9338583Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9339027Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:38:31.9339340Z 2025-12-04T13:38:31.9339438Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9339629Z 2025-12-04T13:38:31.9339699Z Process 2 exited with error code 10 and exception: 2025-12-04T13:38:31.9339893Z Traceback (most recent call last): 2025-12-04T13:38:31.9340211Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9340553Z getattr(self, test_name)() 2025-12-04T13:38:31.9340855Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9341110Z fn() 2025-12-04T13:38:31.9341328Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9341616Z method(*args, **kwargs) 2025-12-04T13:38:31.9341884Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9342198Z method(*args, **kwargs) 2025-12-04T13:38:31.9342432Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9342678Z with policy(): 2025-12-04T13:38:31.9342929Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9343189Z raise RuntimeError(msg) 2025-12-04T13:38:31.9343645Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3380609024. 
2025-12-04T13:38:31.9344065Z 2025-12-04T13:38:31.9344200Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9344555Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:38:31.9344816Z 2025-12-04T13:38:31.9344913Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9345046Z 2025-12-04T13:38:31.9345108Z Process 3 exited with error code 10 and exception: 2025-12-04T13:38:31.9345258Z Traceback (most recent call last): 2025-12-04T13:38:31.9345514Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9345766Z getattr(self, test_name)() 2025-12-04T13:38:31.9346008Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9346277Z fn() 2025-12-04T13:38:31.9346566Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9346805Z method(*args, **kwargs) 2025-12-04T13:38:31.9347032Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9347269Z method(*args, **kwargs) 2025-12-04T13:38:31.9347496Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9347761Z with policy(): 2025-12-04T13:38:31.9348027Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9348268Z raise RuntimeError(msg) 2025-12-04T13:38:31.9348719Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3330277376. 2025-12-04T13:38:31.9349106Z 2025-12-04T13:38:31.9349184Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9349524Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:38:31.9349877Z 2025-12-04T13:38:31.9349972Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9350172Z 2025-12-04T13:38:31.9350174Z 2025-12-04T13:38:31.9350258Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:31.9350469Z Process 1 terminated with exit code 10, terminating remaining processes. 
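Note: the ProcessGroupNCCL warning earlier in this failure ("destroy_process_group() was not called before program exit, which can leak resources") points at missing teardown in the spawned workers. In a standalone script the usual pattern is to destroy the group explicitly in a finally block; a minimal sketch assuming a torchrun-style launch that provides RANK and WORLD_SIZE (the empty body stands in for the real work):

import os
import torch
import torch.distributed as dist

def main():
    rank = int(os.environ["RANK"])
    world_size = int(os.environ["WORLD_SIZE"])
    torch.cuda.set_device(rank)
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    try:
        pass  # training / test body goes here
    finally:
        dist.destroy_process_group()  # release communicators before the process exits

if __name__ == "__main__":
    main()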
2025-12-04T13:38:31.9350845Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-bfe1494716c9ba3f.xml - 2025-12-04T13:38:31.9351220Z =========================== short test summary info ============================ 2025-12-04T13:38:31.9351611Z FAILED [9.3193s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:31.9352000Z Traceback (most recent call last): 2025-12-04T13:38:31.9352255Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9352508Z getattr(self, test_name)() 2025-12-04T13:38:31.9352756Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9353037Z fn() 2025-12-04T13:38:31.9353330Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9353575Z method(*args, **kwargs) 2025-12-04T13:38:31.9353858Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9354099Z method(*args, **kwargs) 2025-12-04T13:38:31.9354330Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9354565Z with policy(): 2025-12-04T13:38:31.9354783Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9355025Z raise RuntimeError(msg) 2025-12-04T13:38:31.9355448Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3397386240. 
2025-12-04T13:38:31.9355833Z 2025-12-04T13:38:31.9355937Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9356281Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:38:31.9356540Z 2025-12-04T13:38:31.9356636Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9356762Z 2025-12-04T13:38:31.9356827Z Process 2 exited with error code 10 and exception: 2025-12-04T13:38:31.9356977Z Traceback (most recent call last): 2025-12-04T13:38:31.9357229Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9357479Z getattr(self, test_name)() 2025-12-04T13:38:31.9357733Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9357976Z fn() 2025-12-04T13:38:31.9358189Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9358427Z method(*args, **kwargs) 2025-12-04T13:38:31.9358654Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9358892Z method(*args, **kwargs) 2025-12-04T13:38:31.9359118Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9359367Z with policy(): 2025-12-04T13:38:31.9359678Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9360304Z raise RuntimeError(msg) 2025-12-04T13:38:31.9361853Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3380609024. 
2025-12-04T13:38:31.9363350Z 2025-12-04T13:38:31.9363554Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9364431Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:38:31.9365106Z 2025-12-04T13:38:31.9365346Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9365690Z 2025-12-04T13:38:31.9365847Z Process 3 exited with error code 10 and exception: 2025-12-04T13:38:31.9366213Z Traceback (most recent call last): 2025-12-04T13:38:31.9366864Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9367498Z getattr(self, test_name)() 2025-12-04T13:38:31.9368096Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9368647Z fn() 2025-12-04T13:38:31.9369049Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9369506Z method(*args, **kwargs) 2025-12-04T13:38:31.9370011Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9370466Z method(*args, **kwargs) 2025-12-04T13:38:31.9370899Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9371349Z with policy(): 2025-12-04T13:38:31.9371772Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9372235Z raise RuntimeError(msg) 2025-12-04T13:38:31.9373159Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3330277376. 2025-12-04T13:38:31.9373958Z 2025-12-04T13:38:31.9374105Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9374768Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:38:31.9375287Z 2025-12-04T13:38:31.9375461Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9375840Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:31.9376239Z ============================== 1 failed in 9.48s =============================== 2025-12-04T13:38:31.9376508Z Got exit code 1 2025-12-04T13:38:31.9376703Z Retrying single test... 
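Note: before the retry, the failure block prints a repro command to run from the base repo dir and notes that the hint itself can be silenced with PYTORCH_PRINT_REPRO_ON_FAILURE=0. A hedged sketch of driving that same command from Python with the same environment variables (the wrapper is illustrative; the command and variables are taken verbatim from the log above):

import os
import subprocess

env = dict(
    os.environ,
    PYTORCH_TEST_WITH_ROCM="1",
    PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1",
)
# Uncomment to suppress the repro hint in the failure message:
# env["PYTORCH_PRINT_REPRO_ON_FAILURE"] = "0"

subprocess.run(
    [
        "python",
        "test/distributed/fsdp/test_fsdp_core.py",
        "TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda",
    ],
    env=env,
    check=True,  # raise CalledProcessError if the test exits non-zero
)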
2025-12-04T13:38:31.9377211Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-8a029b4db9ef552d.xml 2025-12-04T13:38:31.9377771Z ============================= test session starts ============================== 2025-12-04T13:38:31.9378208Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:31.9378580Z cachedir: .pytest_cache 2025-12-04T13:38:31.9379019Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:31.9379541Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:31.9379780Z configfile: pytest.ini 2025-12-04T13:38:31.9380108Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:31.9380543Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:31.9381038Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:38:31.9381474Z Running 1 items in this shard 2025-12-04T13:38:31.9381582Z 2025-12-04T13:38:31.9382018Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda I1204 12:57:07.159000 377817 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 377886 2025-12-04T13:38:31.9382722Z I1204 12:57:07.159000 377817 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 377887 2025-12-04T13:38:31.9383228Z I1204 12:57:07.160000 377817 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 377888 2025-12-04T13:38:31.9383728Z I1204 12:57:07.160000 377817 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 377889 2025-12-04T13:38:31.9384530Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9385161Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9385786Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9386417Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9387291Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:31.9388134Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9388792Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9389282Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9389983Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9390648Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9391153Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9391635Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9392272Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9392930Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9393591Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:31.9394242Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9394517Z [rank0]:E1204 12:57:14.348000 377886 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9394905Z [rank0]:E1204 12:57:14.348000 377886 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9395456Z [rank0]:E1204 12:57:14.348000 377886 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9395994Z [rank0]:E1204 12:57:14.348000 377886 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9396537Z [rank0]:E1204 12:57:14.348000 377886 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9397056Z [rank0]:E1204 12:57:14.348000 377886 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9397559Z [rank0]:E1204 12:57:14.348000 377886 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9398088Z [rank0]:E1204 12:57:14.348000 377886 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9398624Z [rank0]:E1204 12:57:14.348000 377886 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9399091Z [rank0]:E1204 12:57:14.348000 377886 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9399557Z [rank0]:E1204 12:57:14.348000 377886 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9400080Z [rank0]:E1204 12:57:14.348000 377886 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9400557Z [rank0]:E1204 12:57:14.348000 377886 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9401031Z [rank0]:E1204 12:57:14.348000 377886 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9401699Z [rank0]:E1204 12:57:14.348000 377886 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 
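Note: the UserWarning from torch/nn/modules/transformer.py repeated above fires because the encoder layer was built without batch_first=True, which disables the nested-tensor fast path. A minimal sketch of constructing the encoder so that warning does not trigger (d_model, nhead, and num_layers are illustrative values, not the test's actual configuration):

import torch.nn as nn

# batch_first=True on the layer lets enable_nested_tensor take effect on the encoder.
layer = nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=6, enable_nested_tensor=True)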
2025-12-04T13:38:31.9402374Z [rank0]:E1204 12:57:14.348000 377886 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9402729Z [rank0]:E1204 12:57:14.348000 377886 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9403343Z [rank0]:E1204 12:57:14.348000 377886 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:38:31.9403852Z [rank0]:E1204 12:57:14.348000 377886 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9404228Z [rank0]:E1204 12:57:14.348000 377886 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9404653Z [rank0]:E1204 12:57:14.348000 377886 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:31.9404901Z dist init r=0, world=4 2025-12-04T13:38:31.9405113Z [rank2]:E1204 12:57:14.356000 377888 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9405472Z [rank2]:E1204 12:57:14.356000 377888 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9405967Z [rank2]:E1204 12:57:14.356000 377888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9406450Z [rank2]:E1204 12:57:14.356000 377888 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9406931Z [rank2]:E1204 12:57:14.356000 377888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9407383Z [rank2]:E1204 12:57:14.356000 377888 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9407848Z [rank2]:E1204 12:57:14.356000 377888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9408319Z [rank2]:E1204 12:57:14.356000 377888 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9408796Z [rank2]:E1204 12:57:14.356000 377888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9409266Z [rank2]:E1204 12:57:14.356000 377888 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9409799Z [rank2]:E1204 12:57:14.356000 377888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9410259Z [rank2]:E1204 12:57:14.356000 377888 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:31.9410722Z [rank2]:E1204 12:57:14.356000 377888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9411204Z [rank2]:E1204 12:57:14.356000 377888 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9411864Z [rank2]:E1204 12:57:14.356000 377888 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3380609024. 2025-12-04T13:38:31.9412526Z [rank2]:E1204 12:57:14.356000 377888 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9412882Z [rank2]:E1204 12:57:14.356000 377888 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9413474Z [rank2]:E1204 12:57:14.356000 377888 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:38:31.9413980Z [rank2]:E1204 12:57:14.356000 377888 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9414351Z [rank2]:E1204 12:57:14.356000 377888 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9414775Z [rank2]:E1204 12:57:14.356000 377888 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:31.9415025Z dist init r=2, world=4 2025-12-04T13:38:31.9415235Z [rank3]:E1204 12:57:14.368000 377889 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9415577Z [rank3]:E1204 12:57:14.368000 377889 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9416070Z [rank3]:E1204 12:57:14.368000 377889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9416555Z [rank3]:E1204 12:57:14.368000 377889 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9417044Z [rank3]:E1204 12:57:14.368000 377889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9417518Z [rank3]:E1204 12:57:14.368000 377889 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9417970Z [rank3]:E1204 12:57:14.368000 377889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9418444Z [rank3]:E1204 12:57:14.368000 377889 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:38:31.9418916Z [rank3]:E1204 12:57:14.368000 377889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9419405Z [rank3]:E1204 12:57:14.368000 377889 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9419927Z [rank3]:E1204 12:57:14.368000 377889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9420383Z [rank3]:E1204 12:57:14.368000 377889 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9420842Z [rank3]:E1204 12:57:14.368000 377889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9421328Z [rank3]:E1204 12:57:14.368000 377889 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9421994Z [rank3]:E1204 12:57:14.368000 377889 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3330277376. 2025-12-04T13:38:31.9422628Z [rank3]:E1204 12:57:14.368000 377889 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9422982Z [rank3]:E1204 12:57:14.368000 377889 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9423574Z [rank3]:E1204 12:57:14.368000 377889 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:38:31.9424084Z [rank3]:E1204 12:57:14.368000 377889 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9424455Z [rank3]:E1204 12:57:14.368000 377889 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9424874Z [rank3]:E1204 12:57:14.368000 377889 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:31.9425123Z dist init r=3, world=4 2025-12-04T13:38:31.9425333Z [rank1]:E1204 12:57:14.376000 377887 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9425677Z [rank1]:E1204 12:57:14.376000 377887 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9426172Z [rank1]:E1204 12:57:14.376000 377887 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9426679Z [rank1]:E1204 12:57:14.376000 377887 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:38:31.9427161Z [rank1]:E1204 12:57:14.376000 377887 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9427614Z [rank1]:E1204 12:57:14.376000 377887 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9428059Z [rank1]:E1204 12:57:14.376000 377887 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9428533Z [rank1]:E1204 12:57:14.376000 377887 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9429028Z [rank1]:E1204 12:57:14.376000 377887 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9429495Z [rank1]:E1204 12:57:14.376000 377887 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9430045Z [rank1]:E1204 12:57:14.376000 377887 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9430526Z [rank1]:E1204 12:57:14.376000 377887 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9430994Z [rank1]:E1204 12:57:14.376000 377887 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9431499Z [rank1]:E1204 12:57:14.376000 377887 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9432166Z [rank1]:E1204 12:57:14.376000 377887 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3397386240. 
2025-12-04T13:38:31.9432791Z [rank1]:E1204 12:57:14.376000 377887 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9433148Z [rank1]:E1204 12:57:14.376000 377887 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9433753Z [rank1]:E1204 12:57:14.376000 377887 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:38:31.9434260Z [rank1]:E1204 12:57:14.376000 377887 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9434633Z [rank1]:E1204 12:57:14.376000 377887 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9435053Z [rank1]:E1204 12:57:14.376000 377887 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:31.9435302Z dist init r=1, world=4 2025-12-04T13:38:31.9435731Z [rank0]:[W1204 12:57:14.510330657 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:31.9436151Z FAILED [9.0196s] [100%] 2025-12-04T13:38:31.9436243Z 2025-12-04T13:38:31.9436306Z =================================== FAILURES =================================== 2025-12-04T13:38:31.9436504Z ____ TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda ____ 2025-12-04T13:38:31.9436691Z Traceback (most recent call last): 2025-12-04T13:38:31.9436946Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:31.9437200Z self._join_processes(fn) 2025-12-04T13:38:31.9437459Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:31.9437736Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:31.9438035Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:31.9438307Z raise RuntimeError(error) 2025-12-04T13:38:31.9438473Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:31.9438650Z Traceback (most recent call last): 2025-12-04T13:38:31.9438902Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9439153Z getattr(self, test_name)() 2025-12-04T13:38:31.9439396Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9439704Z fn() 2025-12-04T13:38:31.9439917Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9440160Z method(*args, **kwargs) 2025-12-04T13:38:31.9440397Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9440666Z method(*args, **kwargs) 2025-12-04T13:38:31.9440899Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9441138Z with policy(): 2025-12-04T13:38:31.9441362Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9441605Z raise RuntimeError(msg) 2025-12-04T13:38:31.9442032Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 2025-12-04T13:38:31.9442421Z 2025-12-04T13:38:31.9442506Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9442855Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:38:31.9443121Z 2025-12-04T13:38:31.9443220Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9443348Z 2025-12-04T13:38:31.9443350Z 2025-12-04T13:38:31.9443440Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:31.9443652Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:31.9444025Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-8a029b4db9ef552d.xml - 2025-12-04T13:38:31.9444369Z =========================== short test summary info ============================ 2025-12-04T13:38:31.9444722Z FAILED [9.0196s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:31.9445057Z Traceback (most recent call last): 2025-12-04T13:38:31.9445337Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9445591Z getattr(self, test_name)() 2025-12-04T13:38:31.9445835Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9446077Z fn() 2025-12-04T13:38:31.9446288Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9446531Z method(*args, **kwargs) 2025-12-04T13:38:31.9446762Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9447002Z method(*args, **kwargs) 2025-12-04T13:38:31.9447247Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9447489Z with policy(): 2025-12-04T13:38:31.9447718Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9447961Z raise RuntimeError(msg) 2025-12-04T13:38:31.9448385Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 
2025-12-04T13:38:31.9448780Z 2025-12-04T13:38:31.9448863Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9449208Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:38:31.9449476Z 2025-12-04T13:38:31.9449620Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9449840Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:31.9450018Z ======================= 1 failed, 32 deselected in 9.18s ======================= 2025-12-04T13:38:31.9450167Z Got exit code 1 2025-12-04T13:38:31.9450278Z Retrying single test... 2025-12-04T13:38:31.9450545Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-0a24c8ddda618e13.xml 2025-12-04T13:38:31.9450839Z ============================= test session starts ============================== 2025-12-04T13:38:31.9451064Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:31.9451266Z cachedir: .pytest_cache 2025-12-04T13:38:31.9451501Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:31.9451756Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:31.9451887Z configfile: pytest.ini 2025-12-04T13:38:31.9452130Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:31.9452414Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:31.9452757Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:38:31.9453069Z Running 1 items in this shard 2025-12-04T13:38:31.9453152Z 2025-12-04T13:38:31.9453458Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda I1204 12:57:18.775000 378219 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 378288 2025-12-04T13:38:31.9453970Z I1204 12:57:18.776000 378219 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 378289 2025-12-04T13:38:31.9454343Z I1204 12:57:18.777000 378219 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 378290 2025-12-04T13:38:31.9454698Z I1204 12:57:18.777000 378219 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 378291 2025-12-04T13:38:31.9455257Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9455710Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9456321Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9456920Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9457384Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9457837Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9458416Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9459035Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9459498Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9459983Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9460423Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9460866Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9461446Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9462042Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9462638Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:31.9463232Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9463482Z [rank1]:E1204 12:57:25.942000 378289 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9463835Z [rank1]:E1204 12:57:25.942000 378289 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9464352Z [rank1]:E1204 12:57:25.942000 378289 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9464843Z [rank1]:E1204 12:57:25.942000 378289 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9465329Z [rank1]:E1204 12:57:25.942000 378289 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9465789Z [rank1]:E1204 12:57:25.942000 378289 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9466253Z [rank1]:E1204 12:57:25.942000 378289 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9466727Z [rank1]:E1204 12:57:25.942000 378289 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9467200Z [rank1]:E1204 12:57:25.942000 378289 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9467673Z [rank1]:E1204 12:57:25.942000 378289 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9468165Z [rank1]:E1204 12:57:25.942000 378289 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9468629Z [rank1]:E1204 12:57:25.942000 378289 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9469118Z [rank1]:E1204 12:57:25.942000 378289 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9469667Z [rank1]:E1204 12:57:25.942000 378289 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9470340Z [rank1]:E1204 12:57:25.942000 378289 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3397386240. 
2025-12-04T13:38:31.9470969Z [rank1]:E1204 12:57:25.942000 378289 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9471329Z [rank1]:E1204 12:57:25.942000 378289 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9471922Z [rank1]:E1204 12:57:25.942000 378289 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:38:31.9472431Z [rank1]:E1204 12:57:25.942000 378289 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9472806Z [rank1]:E1204 12:57:25.942000 378289 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9473228Z [rank1]:E1204 12:57:25.942000 378289 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:31.9473481Z dist init r=1, world=4 2025-12-04T13:38:31.9473708Z [rank0]:E1204 12:57:25.945000 378288 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9474055Z [rank0]:E1204 12:57:25.945000 378288 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9474555Z [rank0]:E1204 12:57:25.945000 378288 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9475048Z [rank0]:E1204 12:57:25.945000 378288 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9475552Z [rank0]:E1204 12:57:25.945000 378288 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9476018Z [rank0]:E1204 12:57:25.945000 378288 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9476465Z [rank0]:E1204 12:57:25.945000 378288 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9476943Z [rank0]:E1204 12:57:25.945000 378288 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9477434Z [rank0]:E1204 12:57:25.945000 378288 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9477906Z [rank0]:E1204 12:57:25.945000 378288 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9478399Z [rank0]:E1204 12:57:25.945000 378288 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9478862Z [rank0]:E1204 12:57:25.945000 378288 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:31.9479332Z [rank0]:E1204 12:57:25.945000 378288 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9479834Z [rank0]:E1204 12:57:25.945000 378288 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9480503Z [rank0]:E1204 12:57:25.945000 378288 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 2025-12-04T13:38:31.9481131Z [rank0]:E1204 12:57:25.945000 378288 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9481489Z [rank0]:E1204 12:57:25.945000 378288 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9482086Z [rank0]:E1204 12:57:25.945000 378288 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:38:31.9482596Z [rank0]:E1204 12:57:25.945000 378288 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9482984Z [rank0]:E1204 12:57:25.945000 378288 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9483404Z [rank0]:E1204 12:57:25.945000 378288 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:31.9483652Z dist init r=0, world=4 2025-12-04T13:38:31.9483863Z [rank2]:E1204 12:57:25.954000 378290 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9484207Z [rank2]:E1204 12:57:25.954000 378290 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9484705Z [rank2]:E1204 12:57:25.954000 378290 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9485207Z [rank2]:E1204 12:57:25.954000 378290 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9485696Z [rank2]:E1204 12:57:25.954000 378290 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9486151Z [rank2]:E1204 12:57:25.954000 378290 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9486603Z [rank2]:E1204 12:57:25.954000 378290 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9487094Z [rank2]:E1204 12:57:25.954000 378290 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:38:31.9487573Z [rank2]:E1204 12:57:25.954000 378290 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9488058Z [rank2]:E1204 12:57:25.954000 378290 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9488530Z [rank2]:E1204 12:57:25.954000 378290 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9488997Z [rank2]:E1204 12:57:25.954000 378290 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9489468Z [rank2]:E1204 12:57:25.954000 378290 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9490007Z [rank2]:E1204 12:57:25.954000 378290 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9490786Z [rank2]:E1204 12:57:25.954000 378290 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3380609024. 2025-12-04T13:38:31.9491407Z [rank2]:E1204 12:57:25.954000 378290 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9491762Z [rank2]:E1204 12:57:25.954000 378290 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9492371Z [rank2]:E1204 12:57:25.954000 378290 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:38:31.9492883Z [rank2]:E1204 12:57:25.954000 378290 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9493255Z [rank2]:E1204 12:57:25.954000 378290 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9493675Z [rank2]:E1204 12:57:25.954000 378290 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:31.9493927Z dist init r=2, world=4 2025-12-04T13:38:31.9494137Z [rank3]:E1204 12:57:25.994000 378291 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9494495Z [rank3]:E1204 12:57:25.994000 378291 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9494993Z [rank3]:E1204 12:57:25.994000 378291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9495479Z [rank3]:E1204 12:57:25.994000 378291 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:38:31.9495971Z [rank3]:E1204 12:57:25.994000 378291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9496449Z [rank3]:E1204 12:57:25.994000 378291 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9496899Z [rank3]:E1204 12:57:25.994000 378291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9497383Z [rank3]:E1204 12:57:25.994000 378291 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9497856Z [rank3]:E1204 12:57:25.994000 378291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9498329Z [rank3]:E1204 12:57:25.994000 378291 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9498857Z [rank3]:E1204 12:57:25.994000 378291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9499434Z [rank3]:E1204 12:57:25.994000 378291 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9499978Z [rank3]:E1204 12:57:25.994000 378291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9500486Z [rank3]:E1204 12:57:25.994000 378291 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9501184Z [rank3]:E1204 12:57:25.994000 378291 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3330277376. 
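A quick check of the figures reported above, all copied from this log: on every rank the caching-allocator counter grew from 512 to 19456 bytes (19456 - 512 = 18944 bytes), and the driver-level counter grew by an identical amount on all four devices: 3533701120 - 2453667840 = 1080033280 bytes on device 0, 3397386240 - 2317352960 = 1080033280 on device 1, 3380609024 - 2300575744 = 1080033280 on device 2, and 3330277376 - 2250244096 = 1080033280 on device 3, i.e. exactly 1030 MiB (about 1.01 GiB) per GPU. The uniform delta suggests a fixed-size allocation surviving the test on every rank rather than rank-specific noise.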
2025-12-04T13:38:31.9501836Z [rank3]:E1204 12:57:25.994000 378291 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9502264Z [rank3]:E1204 12:57:25.994000 378291 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9502886Z [rank3]:E1204 12:57:25.994000 378291 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:38:31.9503414Z [rank3]:E1204 12:57:25.994000 378291 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9503831Z [rank3]:E1204 12:57:25.994000 378291 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9504277Z [rank3]:E1204 12:57:25.994000 378291 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:31.9504583Z dist init r=3, world=4 2025-12-04T13:38:31.9505046Z [rank0]:[W1204 12:57:26.110710486 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:31.9505487Z FAILED [9.0183s] [100%] 2025-12-04T13:38:31.9505572Z 2025-12-04T13:38:31.9505662Z =================================== FAILURES =================================== 2025-12-04T13:38:31.9505892Z ____ TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda ____ 2025-12-04T13:38:31.9506122Z Traceback (most recent call last): 2025-12-04T13:38:31.9506405Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:31.9506689Z self._join_processes(fn) 2025-12-04T13:38:31.9506982Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:31.9507305Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:31.9507602Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:31.9507921Z raise RuntimeError(error) 2025-12-04T13:38:31.9508104Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:31.9508303Z Traceback (most recent call last): 2025-12-04T13:38:31.9508591Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9508870Z getattr(self, test_name)() 2025-12-04T13:38:31.9509153Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9509420Z fn() 2025-12-04T13:38:31.9509705Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9509991Z method(*args, **kwargs) 2025-12-04T13:38:31.9510250Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9510518Z method(*args, **kwargs) 2025-12-04T13:38:31.9510783Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9511044Z with policy(): 2025-12-04T13:38:31.9511305Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9511573Z raise RuntimeError(msg) 2025-12-04T13:38:31.9512014Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 2025-12-04T13:38:31.9512442Z 2025-12-04T13:38:31.9512546Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9512916Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:38:31.9513197Z 2025-12-04T13:38:31.9513306Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9513462Z 2025-12-04T13:38:31.9513464Z 2025-12-04T13:38:31.9513555Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:31.9513796Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:31.9514203Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-0a24c8ddda618e13.xml - 2025-12-04T13:38:31.9514583Z =========================== short test summary info ============================ 2025-12-04T13:38:31.9514960Z FAILED [9.0183s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:31.9515347Z Traceback (most recent call last): 2025-12-04T13:38:31.9515619Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9515926Z getattr(self, test_name)() 2025-12-04T13:38:31.9516198Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9516473Z fn() 2025-12-04T13:38:31.9516729Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9516997Z method(*args, **kwargs) 2025-12-04T13:38:31.9517264Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9517555Z method(*args, **kwargs) 2025-12-04T13:38:31.9517806Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9518082Z with policy(): 2025-12-04T13:38:31.9518329Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9518590Z raise RuntimeError(msg) 2025-12-04T13:38:31.9519054Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 
2025-12-04T13:38:31.9519441Z 2025-12-04T13:38:31.9519539Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9519962Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:38:31.9520240Z 2025-12-04T13:38:31.9520355Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9520577Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:31.9520797Z ======================= 1 failed, 32 deselected in 9.16s ======================= 2025-12-04T13:38:31.9520973Z Got exit code 1 2025-12-04T13:38:31.9521231Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:38:31.9521619Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:31.9522006Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-91743904ac89e77a.xml 2025-12-04T13:38:31.9522345Z ============================= test session starts ============================== 2025-12-04T13:38:31.9522596Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:31.9522823Z cachedir: .pytest_cache 2025-12-04T13:38:31.9523093Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:31.9523363Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:31.9523514Z configfile: pytest.ini 2025-12-04T13:38:31.9523793Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:31.9524106Z collecting ... collected 60 items / 1 deselected / 59 selected 2025-12-04T13:38:31.9524306Z stepcurrent: skipping 1 already run items. 2025-12-04T13:38:31.9524476Z Running 32 items in this shard 2025-12-04T13:38:31.9524574Z 2025-12-04T13:38:31.9524930Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_False_mixed_precision_False_cuda I1204 12:57:30.534000 378621 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 378690 2025-12-04T13:38:31.9525495Z I1204 12:57:30.535000 378621 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 378691 2025-12-04T13:38:31.9525873Z I1204 12:57:30.535000 378621 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 378692 2025-12-04T13:38:31.9526239Z I1204 12:57:30.536000 378621 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 378693 2025-12-04T13:38:31.9526878Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9527358Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9527982Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. 
FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9528601Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9529081Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9529619Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9530224Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9530848Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9531341Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9531805Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9532293Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9532789Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9533399Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9534018Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9534651Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
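The _init_utils.py UserWarning above offers two remedies for the index-less `device_id` "cuda" argument: pin the current device with torch.cuda.set_device() before constructing FSDP, or pass an explicit per-rank device index as `device_id`. A minimal sketch of both options, where rank and model are hypothetical placeholders supplied by the surrounding training script:

import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# Option 1: pin the current CUDA device for this rank before wrapping the model.
torch.cuda.set_device(rank)
fsdp_model = FSDP(model)

# Option 2: pass an explicit per-rank device index instead of the bare "cuda" device.
fsdp_model = FSDP(model, device_id=torch.device("cuda", rank))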
2025-12-04T13:38:31.9535280Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9535549Z [rank2]:E1204 12:57:36.503000 378692 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9535925Z [rank2]:E1204 12:57:36.503000 378692 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9536461Z [rank2]:E1204 12:57:36.503000 378692 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9536996Z [rank2]:E1204 12:57:36.503000 378692 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9537513Z [rank2]:E1204 12:57:36.503000 378692 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9538018Z [rank2]:E1204 12:57:36.503000 378692 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9538495Z [rank2]:E1204 12:57:36.503000 378692 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9539011Z [rank2]:E1204 12:57:36.503000 378692 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9539517Z [rank2]:E1204 12:57:36.503000 378692 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9540054Z [rank2]:E1204 12:57:36.503000 378692 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9540558Z [rank2]:E1204 12:57:36.503000 378692 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9541044Z [rank2]:E1204 12:57:36.503000 378692 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9541546Z [rank2]:E1204 12:57:36.503000 378692 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9542048Z [rank2]:E1204 12:57:36.503000 378692 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9542775Z [rank2]:E1204 12:57:36.503000 378692 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3076521984. 
2025-12-04T13:38:31.9543470Z [rank2]:E1204 12:57:36.503000 378692 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9543851Z [rank2]:E1204 12:57:36.503000 378692 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9544522Z [rank2]:E1204 12:57:36.503000 378692 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9545091Z [rank2]:E1204 12:57:36.503000 378692 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9545488Z [rank2]:E1204 12:57:36.503000 378692 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9545952Z [rank2]:E1204 12:57:36.503000 378692 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:31.9546229Z dist init r=2, world=4 2025-12-04T13:38:31.9546477Z [rank1]:E1204 12:57:36.505000 378691 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9546865Z [rank1]:E1204 12:57:36.505000 378691 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9547386Z [rank1]:E1204 12:57:36.505000 378691 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9547919Z [rank1]:E1204 12:57:36.505000 378691 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9548425Z [rank1]:E1204 12:57:36.505000 378691 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9548905Z [rank1]:E1204 12:57:36.505000 378691 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9549392Z [rank1]:E1204 12:57:36.505000 378691 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9549929Z [rank1]:E1204 12:57:36.505000 378691 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9550437Z [rank1]:E1204 12:57:36.505000 378691 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9550957Z [rank1]:E1204 12:57:36.505000 378691 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9551445Z [rank1]:E1204 12:57:36.505000 378691 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9551951Z [rank1]:E1204 12:57:36.505000 378691 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T13:38:31.9552441Z [rank1]:E1204 12:57:36.505000 378691 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9552957Z [rank1]:E1204 12:57:36.505000 378691 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9553687Z [rank1]:E1204 12:57:36.505000 378691 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3093299200. 2025-12-04T13:38:31.9554361Z [rank1]:E1204 12:57:36.505000 378691 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9554753Z [rank1]:E1204 12:57:36.505000 378691 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9555414Z [rank1]:E1204 12:57:36.505000 378691 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9555992Z [rank1]:E1204 12:57:36.505000 378691 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9556398Z [rank1]:E1204 12:57:36.505000 378691 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9556856Z [rank1]:E1204 12:57:36.505000 378691 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:31.9557146Z dist init r=1, world=4 2025-12-04T13:38:31.9557382Z [rank0]:E1204 12:57:36.547000 378690 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9557766Z [rank0]:E1204 12:57:36.547000 378690 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9558304Z [rank0]:E1204 12:57:36.547000 378690 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9558812Z [rank0]:E1204 12:57:36.547000 378690 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9559328Z [rank0]:E1204 12:57:36.547000 378690 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9559848Z [rank0]:E1204 12:57:36.547000 378690 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9560325Z [rank0]:E1204 12:57:36.547000 378690 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9560831Z [rank0]:E1204 12:57:36.547000 378690 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9561331Z [rank0]:E1204 12:57:36.547000 378690 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9561827Z [rank0]:E1204 12:57:36.547000 378690 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9562341Z [rank0]:E1204 12:57:36.547000 378690 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9562858Z [rank0]:E1204 12:57:36.547000 378690 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9563358Z [rank0]:E1204 12:57:36.547000 378690 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9563857Z [rank0]:E1204 12:57:36.547000 378690 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9564561Z [rank0]:E1204 12:57:36.547000 378690 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3229614080. 2025-12-04T13:38:31.9565273Z [rank0]:E1204 12:57:36.547000 378690 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9565657Z [rank0]:E1204 12:57:36.547000 378690 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9566316Z [rank0]:E1204 12:57:36.547000 378690 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9566894Z [rank0]:E1204 12:57:36.547000 378690 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9567293Z [rank0]:E1204 12:57:36.547000 378690 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9613504Z [rank0]:E1204 12:57:36.547000 378690 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:31.9613776Z dist init r=0, world=4 2025-12-04T13:38:31.9613997Z [rank3]:E1204 12:57:36.559000 378693 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9614344Z [rank3]:E1204 12:57:36.559000 378693 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9614842Z [rank3]:E1204 12:57:36.559000 378693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T13:38:31.9615335Z [rank3]:E1204 12:57:36.559000 378693 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9615813Z [rank3]:E1204 12:57:36.559000 378693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9616259Z [rank3]:E1204 12:57:36.559000 378693 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9616695Z [rank3]:E1204 12:57:36.559000 378693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9617159Z [rank3]:E1204 12:57:36.559000 378693 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9617622Z [rank3]:E1204 12:57:36.559000 378693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9618081Z [rank3]:E1204 12:57:36.559000 378693 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9618652Z [rank3]:E1204 12:57:36.559000 378693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9619103Z [rank3]:E1204 12:57:36.559000 378693 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9619551Z [rank3]:E1204 12:57:36.559000 378693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9620049Z [rank3]:E1204 12:57:36.559000 378693 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9620766Z [rank3]:E1204 12:57:36.559000 378693 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3026190336. 
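The same pattern shows up in the numbers for this second test: the caching allocator goes from 512 to 12800 bytes on each rank (12800 - 512 = 12288 bytes), and the driver-level growth is again identical across devices, e.g. 3229614080 - 2453667840 = 775946240 bytes on device 0 and 3026190336 - 2250244096 = 775946240 on device 3, i.e. exactly 740 MiB per GPU, smaller than in the previous test but still uniform across all ranks.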
2025-12-04T13:38:31.9621407Z [rank3]:E1204 12:57:36.559000 378693 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9621754Z [rank3]:E1204 12:57:36.559000 378693 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9622385Z [rank3]:E1204 12:57:36.559000 378693 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9622932Z [rank3]:E1204 12:57:36.559000 378693 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9623295Z [rank3]:E1204 12:57:36.559000 378693 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9623706Z [rank3]:E1204 12:57:36.559000 378693 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:31.9623945Z dist init r=3, world=4 2025-12-04T13:38:31.9624347Z [rank0]:[W1204 12:57:36.799763242 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:31.9624752Z FAILED [7.7175s] [ 3%] 2025-12-04T13:38:31.9624816Z 2025-12-04T13:38:31.9624877Z =================================== FAILURES =================================== 2025-12-04T13:38:31.9625090Z _ TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda _ 2025-12-04T13:38:31.9625288Z Traceback (most recent call last): 2025-12-04T13:38:31.9625532Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:31.9625776Z self._join_processes(fn) 2025-12-04T13:38:31.9626020Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:31.9626283Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:31.9626550Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:31.9626807Z raise RuntimeError(error) 2025-12-04T13:38:31.9626956Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:31.9627115Z Traceback (most recent call last): 2025-12-04T13:38:31.9627366Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9627605Z getattr(self, test_name)() 2025-12-04T13:38:31.9627834Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9628065Z fn() 2025-12-04T13:38:31.9628263Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9628492Z method(*args, **kwargs) 2025-12-04T13:38:31.9628710Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9628935Z method(*args, **kwargs) 
2025-12-04T13:38:31.9629150Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9629387Z with policy(): 2025-12-04T13:38:31.9629631Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9629860Z raise RuntimeError(msg) 2025-12-04T13:38:31.9630293Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3093299200. 2025-12-04T13:38:31.9630695Z 2025-12-04T13:38:31.9630787Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9631149Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9631434Z 2025-12-04T13:38:31.9631525Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9631662Z 2025-12-04T13:38:31.9631722Z Process 2 exited with error code 10 and exception: 2025-12-04T13:38:31.9631861Z Traceback (most recent call last): 2025-12-04T13:38:31.9632103Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9632344Z getattr(self, test_name)() 2025-12-04T13:38:31.9632573Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9632803Z fn() 2025-12-04T13:38:31.9633001Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9633230Z method(*args, **kwargs) 2025-12-04T13:38:31.9633446Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9633674Z method(*args, **kwargs) 2025-12-04T13:38:31.9633889Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9634112Z with policy(): 2025-12-04T13:38:31.9634320Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9634549Z raise RuntimeError(msg) 2025-12-04T13:38:31.9634982Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3076521984. 
2025-12-04T13:38:31.9635385Z 2025-12-04T13:38:31.9635458Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9635815Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9636114Z 2025-12-04T13:38:31.9636202Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9636325Z 2025-12-04T13:38:31.9636327Z 2025-12-04T13:38:31.9636405Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:31.9636604Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:31.9636963Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-91743904ac89e77a.xml - 2025-12-04T13:38:31.9637287Z =========================== short test summary info ============================ 2025-12-04T13:38:31.9637672Z FAILED [7.7175s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_False_mixed_precision_False_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:31.9638009Z Traceback (most recent call last): 2025-12-04T13:38:31.9638252Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9638492Z getattr(self, test_name)() 2025-12-04T13:38:31.9638720Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9638950Z fn() 2025-12-04T13:38:31.9639147Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9639384Z method(*args, **kwargs) 2025-12-04T13:38:31.9639630Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9639856Z method(*args, **kwargs) 2025-12-04T13:38:31.9640071Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9640307Z with policy(): 2025-12-04T13:38:31.9640517Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9640744Z raise RuntimeError(msg) 2025-12-04T13:38:31.9641178Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3093299200. 
2025-12-04T13:38:31.9641580Z 2025-12-04T13:38:31.9641653Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9642008Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9642286Z 2025-12-04T13:38:31.9642375Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9642497Z 2025-12-04T13:38:31.9642557Z Process 2 exited with error code 10 and exception: 2025-12-04T13:38:31.9642691Z Traceback (most recent call last): 2025-12-04T13:38:31.9642928Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9643167Z getattr(self, test_name)() 2025-12-04T13:38:31.9643396Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9643625Z fn() 2025-12-04T13:38:31.9643820Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9644046Z method(*args, **kwargs) 2025-12-04T13:38:31.9644261Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9644486Z method(*args, **kwargs) 2025-12-04T13:38:31.9644723Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9644946Z with policy(): 2025-12-04T13:38:31.9645152Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9645383Z raise RuntimeError(msg) 2025-12-04T13:38:31.9645811Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3076521984. 2025-12-04T13:38:31.9646212Z 2025-12-04T13:38:31.9646287Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9646653Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9646934Z 2025-12-04T13:38:31.9647020Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9647204Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:31.9647367Z ======================= 1 failed, 1 deselected in 7.86s ======================== 2025-12-04T13:38:31.9647503Z Got exit code 1 2025-12-04T13:38:31.9647597Z Retrying single test... 
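[editor's note] The RuntimeError above is raised by the memory-leak check that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 turns on: the test wrapper records per-device allocation counters (caching allocator and driver-level) before the test and fails if they have grown afterwards, which is what "was 512 and is now reported as 12800" refers to. As a rough, hedged sketch of that before/after comparison only — the real wrapper lives in torch.testing._internal.common_utils and does considerably more, and run_with_rough_leak_check below is a hypothetical helper, not PyTorch's code:

    import gc
    import torch

    def run_with_rough_leak_check(fn, device: int = 0) -> None:
        # Illustration of the leak-check idea using public torch.cuda counters.
        torch.cuda.synchronize(device)
        before = torch.cuda.memory_allocated(device)
        fn()
        # Drop Python references and cached blocks before re-measuring.
        gc.collect()
        torch.cuda.empty_cache()
        torch.cuda.synchronize(device)
        after = torch.cuda.memory_allocated(device)
        if after > before:
            raise RuntimeError(
                f"possible leak on device {device}: {before} -> {after} bytes"
            )

The repro line printed in the summary re-runs just this test with the same check enabled, so the comparison can be reproduced outside CI.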
2025-12-04T13:38:31.9647862Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-27724013ff9ffa03.xml 2025-12-04T13:38:31.9648139Z ============================= test session starts ============================== 2025-12-04T13:38:31.9648347Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:31.9648543Z cachedir: .pytest_cache 2025-12-04T13:38:31.9648763Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:31.9649000Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:31.9649115Z configfile: pytest.ini 2025-12-04T13:38:31.9649340Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:31.9649683Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:31.9650033Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9650349Z Running 1 items in this shard 2025-12-04T13:38:31.9650421Z 2025-12-04T13:38:31.9650749Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_False_mixed_precision_False_cuda I1204 12:57:40.863000 379007 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 379076 2025-12-04T13:38:31.9651260Z I1204 12:57:40.863000 379007 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 379077 2025-12-04T13:38:31.9651601Z I1204 12:57:40.864000 379007 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 379078 2025-12-04T13:38:31.9651937Z I1204 12:57:40.865000 379007 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 379079 2025-12-04T13:38:31.9652482Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9652923Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9653525Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:31.9654110Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9654554Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9654986Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9655428Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9655858Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9656283Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9656709Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9657284Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9657878Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9658465Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9659044Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9659657Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:31.9660231Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9660468Z [rank1]:E1204 12:57:47.048000 379077 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9660807Z [rank1]:E1204 12:57:47.048000 379077 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9661292Z [rank1]:E1204 12:57:47.048000 379077 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9661768Z [rank1]:E1204 12:57:47.048000 379077 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9662245Z [rank1]:E1204 12:57:47.048000 379077 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9662706Z [rank1]:E1204 12:57:47.048000 379077 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9663146Z [rank1]:E1204 12:57:47.048000 379077 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9663604Z [rank1]:E1204 12:57:47.048000 379077 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9664070Z [rank1]:E1204 12:57:47.048000 379077 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9664525Z [rank1]:E1204 12:57:47.048000 379077 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9664995Z [rank1]:E1204 12:57:47.048000 379077 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9665439Z [rank1]:E1204 12:57:47.048000 379077 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9665887Z [rank1]:E1204 12:57:47.048000 379077 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9666360Z [rank1]:E1204 12:57:47.048000 379077 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9667043Z [rank1]:E1204 12:57:47.048000 379077 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3093299200. 
2025-12-04T13:38:31.9667691Z [rank1]:E1204 12:57:47.048000 379077 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9668034Z [rank1]:E1204 12:57:47.048000 379077 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9668638Z [rank1]:E1204 12:57:47.048000 379077 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9669161Z [rank1]:E1204 12:57:47.048000 379077 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9669524Z [rank1]:E1204 12:57:47.048000 379077 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9669976Z [rank1]:E1204 12:57:47.048000 379077 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:31.9670213Z dist init r=1, world=4 2025-12-04T13:38:31.9670420Z [rank3]:E1204 12:57:47.071000 379079 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9670758Z [rank3]:E1204 12:57:47.071000 379079 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9671244Z [rank3]:E1204 12:57:47.071000 379079 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9671742Z [rank3]:E1204 12:57:47.071000 379079 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9672222Z [rank3]:E1204 12:57:47.071000 379079 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9672676Z [rank3]:E1204 12:57:47.071000 379079 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9673116Z [rank3]:E1204 12:57:47.071000 379079 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9673578Z [rank3]:E1204 12:57:47.071000 379079 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9674055Z [rank3]:E1204 12:57:47.071000 379079 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9674521Z [rank3]:E1204 12:57:47.071000 379079 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9674985Z [rank3]:E1204 12:57:47.071000 379079 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9675450Z [rank3]:E1204 12:57:47.071000 379079 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T13:38:31.9675906Z [rank3]:E1204 12:57:47.071000 379079 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9676388Z [rank3]:E1204 12:57:47.071000 379079 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9677067Z [rank3]:E1204 12:57:47.071000 379079 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3026190336. 2025-12-04T13:38:31.9677710Z [rank3]:E1204 12:57:47.071000 379079 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9678058Z [rank3]:E1204 12:57:47.071000 379079 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9678674Z [rank3]:E1204 12:57:47.071000 379079 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9679198Z [rank3]:E1204 12:57:47.071000 379079 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9679561Z [rank3]:E1204 12:57:47.071000 379079 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9680023Z [rank3]:E1204 12:57:47.071000 379079 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:31.9680265Z dist init r=3, world=4 2025-12-04T13:38:31.9680469Z [rank2]:E1204 12:57:47.087000 379078 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9680806Z [rank2]:E1204 12:57:47.087000 379078 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9681306Z [rank2]:E1204 12:57:47.087000 379078 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9681784Z [rank2]:E1204 12:57:47.087000 379078 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9682259Z [rank2]:E1204 12:57:47.087000 379078 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9682707Z [rank2]:E1204 12:57:47.087000 379078 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9683158Z [rank2]:E1204 12:57:47.087000 379078 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9683622Z [rank2]:E1204 12:57:47.087000 379078 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9684085Z [rank2]:E1204 12:57:47.087000 379078 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9684547Z [rank2]:E1204 12:57:47.087000 379078 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9685022Z [rank2]:E1204 12:57:47.087000 379078 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9685477Z [rank2]:E1204 12:57:47.087000 379078 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9685949Z [rank2]:E1204 12:57:47.087000 379078 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9686414Z [rank2]:E1204 12:57:47.087000 379078 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9687093Z [rank2]:E1204 12:57:47.087000 379078 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3076521984. 2025-12-04T13:38:31.9687734Z [rank2]:E1204 12:57:47.087000 379078 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9688084Z [rank2]:E1204 12:57:47.087000 379078 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9688694Z [rank2]:E1204 12:57:47.087000 379078 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9689216Z [rank2]:E1204 12:57:47.087000 379078 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9689621Z [rank2]:E1204 12:57:47.087000 379078 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9690036Z [rank2]:E1204 12:57:47.087000 379078 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:31.9690280Z dist init r=2, world=4 2025-12-04T13:38:31.9690499Z [rank0]:E1204 12:57:47.102000 379076 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9690837Z [rank0]:E1204 12:57:47.102000 379076 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9691328Z [rank0]:E1204 12:57:47.102000 379076 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T13:38:31.9691811Z [rank0]:E1204 12:57:47.102000 379076 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9692303Z [rank0]:E1204 12:57:47.102000 379076 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9692751Z [rank0]:E1204 12:57:47.102000 379076 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9693190Z [rank0]:E1204 12:57:47.102000 379076 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9693651Z [rank0]:E1204 12:57:47.102000 379076 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9694128Z [rank0]:E1204 12:57:47.102000 379076 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9694589Z [rank0]:E1204 12:57:47.102000 379076 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9695064Z [rank0]:E1204 12:57:47.102000 379076 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9695514Z [rank0]:E1204 12:57:47.102000 379076 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9695969Z [rank0]:E1204 12:57:47.102000 379076 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9696434Z [rank0]:E1204 12:57:47.102000 379076 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9697117Z [rank0]:E1204 12:57:47.102000 379076 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3229614080. 
2025-12-04T13:38:31.9697759Z [rank0]:E1204 12:57:47.102000 379076 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9698108Z [rank0]:E1204 12:57:47.102000 379076 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9698717Z [rank0]:E1204 12:57:47.102000 379076 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9699242Z [rank0]:E1204 12:57:47.102000 379076 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9699645Z [rank0]:E1204 12:57:47.102000 379076 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9700058Z [rank0]:E1204 12:57:47.102000 379076 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:31.9700303Z dist init r=0, world=4 2025-12-04T13:38:31.9700704Z [rank0]:[W1204 12:57:47.435284909 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:31.9701114Z FAILED [8.0172s] [100%] 2025-12-04T13:38:31.9701178Z 2025-12-04T13:38:31.9701242Z =================================== FAILURES =================================== 2025-12-04T13:38:31.9701476Z _ TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda _ 2025-12-04T13:38:31.9701678Z Traceback (most recent call last): 2025-12-04T13:38:31.9701927Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:31.9702175Z self._join_processes(fn) 2025-12-04T13:38:31.9702421Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:31.9702690Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:31.9702961Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:31.9703236Z raise RuntimeError(error) 2025-12-04T13:38:31.9703390Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:31.9703553Z Traceback (most recent call last): 2025-12-04T13:38:31.9703794Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9704050Z getattr(self, test_name)() 2025-12-04T13:38:31.9704286Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9704521Z fn() 2025-12-04T13:38:31.9704723Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9704959Z method(*args, **kwargs) 2025-12-04T13:38:31.9705180Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9705415Z method(*args, **kwargs) 
2025-12-04T13:38:31.9705639Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9705869Z with policy(): 2025-12-04T13:38:31.9706084Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9706320Z raise RuntimeError(msg) 2025-12-04T13:38:31.9706758Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3093299200. 2025-12-04T13:38:31.9707162Z 2025-12-04T13:38:31.9707238Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9707603Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9707889Z 2025-12-04T13:38:31.9707978Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9708108Z 2025-12-04T13:38:31.9708171Z Process 3 exited with error code 10 and exception: 2025-12-04T13:38:31.9708312Z Traceback (most recent call last): 2025-12-04T13:38:31.9708568Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9708816Z getattr(self, test_name)() 2025-12-04T13:38:31.9709051Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9709288Z fn() 2025-12-04T13:38:31.9709490Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9709768Z method(*args, **kwargs) 2025-12-04T13:38:31.9709992Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9710221Z method(*args, **kwargs) 2025-12-04T13:38:31.9710457Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9710688Z with policy(): 2025-12-04T13:38:31.9710901Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9711133Z raise RuntimeError(msg) 2025-12-04T13:38:31.9711572Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3026190336. 
2025-12-04T13:38:31.9711989Z 2025-12-04T13:38:31.9712065Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9712428Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9712725Z 2025-12-04T13:38:31.9712820Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9712943Z 2025-12-04T13:38:31.9712945Z 2025-12-04T13:38:31.9713027Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:31.9713232Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:31.9713590Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-27724013ff9ffa03.xml - 2025-12-04T13:38:31.9713922Z =========================== short test summary info ============================ 2025-12-04T13:38:31.9714289Z FAILED [8.0172s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_False_mixed_precision_False_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:31.9714632Z Traceback (most recent call last): 2025-12-04T13:38:31.9714883Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9715126Z getattr(self, test_name)() 2025-12-04T13:38:31.9715362Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9715597Z fn() 2025-12-04T13:38:31.9715802Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9716033Z method(*args, **kwargs) 2025-12-04T13:38:31.9716259Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9716489Z method(*args, **kwargs) 2025-12-04T13:38:31.9716707Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9716938Z with policy(): 2025-12-04T13:38:31.9717168Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9717403Z raise RuntimeError(msg) 2025-12-04T13:38:31.9717842Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3093299200. 
2025-12-04T13:38:31.9718248Z 2025-12-04T13:38:31.9718328Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9718691Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9718974Z 2025-12-04T13:38:31.9719073Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9719200Z 2025-12-04T13:38:31.9719260Z Process 3 exited with error code 10 and exception: 2025-12-04T13:38:31.9719401Z Traceback (most recent call last): 2025-12-04T13:38:31.9719671Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9719915Z getattr(self, test_name)() 2025-12-04T13:38:31.9720148Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9720381Z fn() 2025-12-04T13:38:31.9720607Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9720838Z method(*args, **kwargs) 2025-12-04T13:38:31.9721058Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9721303Z method(*args, **kwargs) 2025-12-04T13:38:31.9721525Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9721752Z with policy(): 2025-12-04T13:38:31.9721966Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9722201Z raise RuntimeError(msg) 2025-12-04T13:38:31.9722651Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3026190336. 2025-12-04T13:38:31.9723062Z 2025-12-04T13:38:31.9723137Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9723499Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9723781Z 2025-12-04T13:38:31.9723870Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9724059Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:31.9724226Z ======================= 1 failed, 32 deselected in 8.18s ======================= 2025-12-04T13:38:31.9724364Z Got exit code 1 2025-12-04T13:38:31.9724463Z Retrying single test... 
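[editor's note] The UserWarnings in the run above point at two clean-ups that would silence them: bind each rank to its GPU (torch.cuda.set_device, or an explicit index in FSDP's device_id argument) before FSDP initialization, and call torch.distributed.destroy_process_group() before the process exits so ProcessGroupNCCL does not warn about leaked resources. A minimal per-rank sketch along those lines, assuming model and rank are supplied by the caller:

    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_model(model: torch.nn.Module, rank: int) -> FSDP:
        # Bind this rank to its GPU before FSDP init so device_id is unambiguous,
        # as the warning in the log recommends.
        torch.cuda.set_device(rank)
        return FSDP(model, device_id=rank)  # explicit index instead of bare "cuda"

    def shutdown() -> None:
        # Avoid the "destroy_process_group() was not called" warning at exit.
        if dist.is_initialized():
            dist.destroy_process_group()

This mirrors the remediation suggested by the warnings themselves; it is not a claim about what the test under retry should do.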
2025-12-04T13:38:31.9724719Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-7256f71ab4cb42b0.xml 2025-12-04T13:38:31.9725004Z ============================= test session starts ============================== 2025-12-04T13:38:31.9725217Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:31.9725409Z cachedir: .pytest_cache 2025-12-04T13:38:31.9725650Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:31.9725894Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:31.9726015Z configfile: pytest.ini 2025-12-04T13:38:31.9726242Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:31.9726518Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:31.9726764Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9726809Z Running 1 items in this shard 2025-12-04T13:38:31.9726811Z 2025-12-04T13:38:31.9727155Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_False_mixed_precision_False_cuda I1204 12:57:51.294000 379393 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 379462 2025-12-04T13:38:31.9727319Z I1204 12:57:51.295000 379393 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 379463 2025-12-04T13:38:31.9727469Z I1204 12:57:51.295000 379393 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 379464 2025-12-04T13:38:31.9727621Z I1204 12:57:51.296000 379393 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 379465 2025-12-04T13:38:31.9727980Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9728042Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9728536Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9728611Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9728969Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9729018Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9729512Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. 
FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9729612Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9729968Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9730013Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9730501Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9730563Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9730932Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9730980Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9731467Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:31.9731529Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9731694Z [rank1]:E1204 12:57:57.268000 379463 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9731860Z [rank1]:E1204 12:57:57.268000 379463 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9732153Z [rank1]:E1204 12:57:57.268000 379463 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9732308Z [rank1]:E1204 12:57:57.268000 379463 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9732614Z [rank1]:E1204 12:57:57.268000 379463 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9732741Z [rank1]:E1204 12:57:57.268000 379463 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9733034Z [rank1]:E1204 12:57:57.268000 379463 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9733182Z [rank1]:E1204 12:57:57.268000 379463 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9733461Z [rank1]:E1204 12:57:57.268000 379463 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9733610Z [rank1]:E1204 12:57:57.268000 379463 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9733887Z [rank1]:E1204 12:57:57.268000 379463 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9734027Z [rank1]:E1204 12:57:57.268000 379463 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9734303Z [rank1]:E1204 12:57:57.268000 379463 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9734453Z [rank1]:E1204 12:57:57.268000 379463 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9734961Z [rank1]:E1204 12:57:57.268000 379463 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3093299200. 
2025-12-04T13:38:31.9735079Z [rank1]:E1204 12:57:57.268000 379463 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9735276Z [rank1]:E1204 12:57:57.268000 379463 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9735651Z [rank1]:E1204 12:57:57.268000 379463 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9735767Z [rank1]:E1204 12:57:57.268000 379463 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9735986Z [rank1]:E1204 12:57:57.268000 379463 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9736155Z [rank1]:E1204 12:57:57.268000 379463 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:31.9736194Z dist init r=1, world=4 2025-12-04T13:38:31.9736333Z [rank0]:E1204 12:57:57.271000 379462 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9736491Z [rank0]:E1204 12:57:57.271000 379462 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9736792Z [rank0]:E1204 12:57:57.271000 379462 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9736946Z [rank0]:E1204 12:57:57.271000 379462 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9737243Z [rank0]:E1204 12:57:57.271000 379462 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9737371Z [rank0]:E1204 12:57:57.271000 379462 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9737652Z [rank0]:E1204 12:57:57.271000 379462 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9737804Z [rank0]:E1204 12:57:57.271000 379462 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9738080Z [rank0]:E1204 12:57:57.271000 379462 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9738230Z [rank0]:E1204 12:57:57.271000 379462 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9738511Z [rank0]:E1204 12:57:57.271000 379462 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9738649Z [rank0]:E1204 12:57:57.271000 379462 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T13:38:31.9738934Z [rank0]:E1204 12:57:57.271000 379462 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9739083Z [rank0]:E1204 12:57:57.271000 379462 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9739630Z [rank0]:E1204 12:57:57.271000 379462 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3229614080. 2025-12-04T13:38:31.9739751Z [rank0]:E1204 12:57:57.271000 379462 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9739948Z [rank0]:E1204 12:57:57.271000 379462 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9740344Z [rank0]:E1204 12:57:57.271000 379462 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9740460Z [rank0]:E1204 12:57:57.271000 379462 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9740675Z [rank0]:E1204 12:57:57.271000 379462 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9740839Z [rank0]:E1204 12:57:57.271000 379462 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:31.9740895Z dist init r=0, world=4 2025-12-04T13:38:31.9741032Z [rank2]:E1204 12:57:57.274000 379464 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9741198Z [rank2]:E1204 12:57:57.274000 379464 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9741498Z [rank2]:E1204 12:57:57.274000 379464 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9741654Z [rank2]:E1204 12:57:57.274000 379464 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9741941Z [rank2]:E1204 12:57:57.274000 379464 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9742066Z [rank2]:E1204 12:57:57.274000 379464 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9742351Z [rank2]:E1204 12:57:57.274000 379464 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9742499Z [rank2]:E1204 12:57:57.274000 379464 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9742778Z [rank2]:E1204 12:57:57.274000 379464 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9742925Z [rank2]:E1204 12:57:57.274000 379464 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9743206Z [rank2]:E1204 12:57:57.274000 379464 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9743344Z [rank2]:E1204 12:57:57.274000 379464 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9743633Z [rank2]:E1204 12:57:57.274000 379464 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9743785Z [rank2]:E1204 12:57:57.274000 379464 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9744276Z [rank2]:E1204 12:57:57.274000 379464 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3076521984. 2025-12-04T13:38:31.9744396Z [rank2]:E1204 12:57:57.274000 379464 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9744606Z [rank2]:E1204 12:57:57.274000 379464 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9744980Z [rank2]:E1204 12:57:57.274000 379464 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9745095Z [rank2]:E1204 12:57:57.274000 379464 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9745322Z [rank2]:E1204 12:57:57.274000 379464 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9745490Z [rank2]:E1204 12:57:57.274000 379464 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:31.9745530Z dist init r=2, world=4 2025-12-04T13:38:31.9745683Z [rank3]:E1204 12:57:57.319000 379465 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9745845Z [rank3]:E1204 12:57:57.319000 379465 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9746140Z [rank3]:E1204 12:57:57.319000 379465 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T13:38:31.9746299Z [rank3]:E1204 12:57:57.319000 379465 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9746587Z [rank3]:E1204 12:57:57.319000 379465 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9746715Z [rank3]:E1204 12:57:57.319000 379465 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9746997Z [rank3]:E1204 12:57:57.319000 379465 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9747147Z [rank3]:E1204 12:57:57.319000 379465 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9747425Z [rank3]:E1204 12:57:57.319000 379465 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9747577Z [rank3]:E1204 12:57:57.319000 379465 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9747872Z [rank3]:E1204 12:57:57.319000 379465 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9748009Z [rank3]:E1204 12:57:57.319000 379465 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9748291Z [rank3]:E1204 12:57:57.319000 379465 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9748440Z [rank3]:E1204 12:57:57.319000 379465 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9748945Z [rank3]:E1204 12:57:57.319000 379465 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3026190336. 
2025-12-04T13:38:31.9749060Z [rank3]:E1204 12:57:57.319000 379465 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9749260Z [rank3]:E1204 12:57:57.319000 379465 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9749899Z [rank3]:E1204 12:57:57.319000 379465 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9750027Z [rank3]:E1204 12:57:57.319000 379465 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9750242Z [rank3]:E1204 12:57:57.319000 379465 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9750420Z [rank3]:E1204 12:57:57.319000 379465 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:31.9750463Z dist init r=3, world=4 2025-12-04T13:38:31.9750805Z [rank0]:[W1204 12:57:57.438701986 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:31.9750850Z FAILED [7.7160s] [100%] 2025-12-04T13:38:31.9750853Z 2025-12-04T13:38:31.9750911Z =================================== FAILURES =================================== 2025-12-04T13:38:31.9751030Z _ TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda _ 2025-12-04T13:38:31.9751078Z Traceback (most recent call last): 2025-12-04T13:38:31.9751246Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:31.9751294Z self._join_processes(fn) 2025-12-04T13:38:31.9751468Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:31.9751525Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:31.9751704Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:31.9751753Z raise RuntimeError(error) 2025-12-04T13:38:31.9751833Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:31.9751882Z Traceback (most recent call last): 2025-12-04T13:38:31.9752044Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9752091Z getattr(self, test_name)() 2025-12-04T13:38:31.9752264Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9752302Z fn() 2025-12-04T13:38:31.9752454Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9752500Z method(*args, **kwargs) 2025-12-04T13:38:31.9752651Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9752697Z method(*args, **kwargs) 
2025-12-04T13:38:31.9752850Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9752893Z with policy(): 2025-12-04T13:38:31.9753064Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9753111Z raise RuntimeError(msg) 2025-12-04T13:38:31.9753480Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3229614080. 2025-12-04T13:38:31.9753486Z 2025-12-04T13:38:31.9753564Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9753816Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9753832Z 2025-12-04T13:38:31.9753920Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9753922Z 2025-12-04T13:38:31.9753988Z Process 1 exited with error code 10 and exception: 2025-12-04T13:38:31.9754048Z Traceback (most recent call last): 2025-12-04T13:38:31.9754216Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9754259Z getattr(self, test_name)() 2025-12-04T13:38:31.9754422Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9754457Z fn() 2025-12-04T13:38:31.9754614Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9754656Z method(*args, **kwargs) 2025-12-04T13:38:31.9754813Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9754852Z method(*args, **kwargs) 2025-12-04T13:38:31.9755006Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9755045Z with policy(): 2025-12-04T13:38:31.9755201Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9755243Z raise RuntimeError(msg) 2025-12-04T13:38:31.9755616Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3093299200. 
2025-12-04T13:38:31.9755620Z 2025-12-04T13:38:31.9755698Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9755944Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9755946Z 2025-12-04T13:38:31.9756038Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9756041Z 2025-12-04T13:38:31.9756124Z Process 2 exited with error code 10 and exception: 2025-12-04T13:38:31.9756174Z Traceback (most recent call last): 2025-12-04T13:38:31.9756337Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9756383Z getattr(self, test_name)() 2025-12-04T13:38:31.9756543Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9756583Z fn() 2025-12-04T13:38:31.9756733Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9756777Z method(*args, **kwargs) 2025-12-04T13:38:31.9756931Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9756988Z method(*args, **kwargs) 2025-12-04T13:38:31.9757145Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9757185Z with policy(): 2025-12-04T13:38:31.9757340Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9757380Z raise RuntimeError(msg) 2025-12-04T13:38:31.9757753Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3076521984. 2025-12-04T13:38:31.9757766Z 2025-12-04T13:38:31.9757840Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9758090Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9758103Z 2025-12-04T13:38:31.9758188Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9758190Z 2025-12-04T13:38:31.9758192Z 2025-12-04T13:38:31.9758270Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:31.9758360Z Process 0 terminated with exit code 10, terminating remaining processes. 
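Each failing rank prints the same repro command, so the failure can be re-run outside the harness. A small convenience sketch that repeats that command a few times from a pytorch checkout to see whether the leak is consistent (the command and environment variables are copied from the log above; the retry loop is a local convenience, not what the CI harness does):

    import os
    import subprocess

    env = dict(os.environ,
               PYTORCH_TEST_WITH_ROCM="1",
               PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1")
    cmd = ["python", "test/distributed/fsdp/test_fsdp_core.py",
           "TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda"]
    for attempt in range(3):
        # Run from the base repo dir, as the failure message instructs.
        result = subprocess.run(cmd, env=env)
        print(f"attempt {attempt}: exit code {result.returncode}")

Setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 in the same environment suppresses the repro message, as the log notes.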
2025-12-04T13:38:31.9758596Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-7256f71ab4cb42b0.xml - 2025-12-04T13:38:31.9758664Z =========================== short test summary info ============================ 2025-12-04T13:38:31.9758933Z FAILED [7.7160s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_False_mixed_precision_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:31.9758982Z Traceback (most recent call last): 2025-12-04T13:38:31.9759147Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9759192Z getattr(self, test_name)() 2025-12-04T13:38:31.9759353Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9759391Z fn() 2025-12-04T13:38:31.9759542Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9759801Z method(*args, **kwargs) 2025-12-04T13:38:31.9759953Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9759996Z method(*args, **kwargs) 2025-12-04T13:38:31.9760147Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9760203Z with policy(): 2025-12-04T13:38:31.9760355Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9760398Z raise RuntimeError(msg) 2025-12-04T13:38:31.9760767Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3229614080. 
2025-12-04T13:38:31.9760772Z 2025-12-04T13:38:31.9760845Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9761116Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9761119Z 2025-12-04T13:38:31.9761208Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9761210Z 2025-12-04T13:38:31.9761272Z Process 1 exited with error code 10 and exception: 2025-12-04T13:38:31.9761317Z Traceback (most recent call last): 2025-12-04T13:38:31.9761484Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9761526Z getattr(self, test_name)() 2025-12-04T13:38:31.9761688Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9761736Z fn() 2025-12-04T13:38:31.9761889Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9761929Z method(*args, **kwargs) 2025-12-04T13:38:31.9762084Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9762139Z method(*args, **kwargs) 2025-12-04T13:38:31.9762293Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9762331Z with policy(): 2025-12-04T13:38:31.9762487Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9762532Z raise RuntimeError(msg) 2025-12-04T13:38:31.9762898Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3093299200. 
2025-12-04T13:38:31.9762902Z 2025-12-04T13:38:31.9762980Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9763229Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9763232Z 2025-12-04T13:38:31.9763324Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9763326Z 2025-12-04T13:38:31.9763384Z Process 2 exited with error code 10 and exception: 2025-12-04T13:38:31.9763433Z Traceback (most recent call last): 2025-12-04T13:38:31.9763595Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9763641Z getattr(self, test_name)() 2025-12-04T13:38:31.9763803Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9763841Z fn() 2025-12-04T13:38:31.9763995Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9764039Z method(*args, **kwargs) 2025-12-04T13:38:31.9764208Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9764248Z method(*args, **kwargs) 2025-12-04T13:38:31.9764403Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9764440Z with policy(): 2025-12-04T13:38:31.9764595Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9764638Z raise RuntimeError(msg) 2025-12-04T13:38:31.9765027Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3076521984. 2025-12-04T13:38:31.9765031Z 2025-12-04T13:38:31.9765106Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9765356Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9765358Z 2025-12-04T13:38:31.9765444Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9765511Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:38:31.9765587Z ======================= 1 failed, 32 deselected in 7.88s ======================= 2025-12-04T13:38:31.9765628Z Got exit code 1 2025-12-04T13:38:31.9765831Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_False_mixed_precision_False_cuda 2025-12-04T13:38:31.9765962Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:31.9766168Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-147ebbffa2c93fc5.xml 2025-12-04T13:38:31.9766229Z ============================= test session starts ============================== 2025-12-04T13:38:31.9766346Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:31.9766388Z cachedir: .pytest_cache 2025-12-04T13:38:31.9766550Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:31.9766598Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:31.9766643Z configfile: pytest.ini 2025-12-04T13:38:31.9766805Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:31.9766884Z collecting ... collected 60 items / 2 deselected / 58 selected 2025-12-04T13:38:31.9766937Z stepcurrent: skipping 2 already run items. 2025-12-04T13:38:31.9766985Z Running 31 items in this shard 2025-12-04T13:38:31.9766987Z 2025-12-04T13:38:31.9767314Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_False_mixed_precision_True_cuda I1204 12:58:01.368000 379779 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 379848 2025-12-04T13:38:31.9767474Z I1204 12:58:01.369000 379779 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 379849 2025-12-04T13:38:31.9767628Z I1204 12:58:01.369000 379779 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 379850 2025-12-04T13:38:31.9767782Z I1204 12:58:01.370000 379779 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 379851 2025-12-04T13:38:31.9768163Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9768213Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9768569Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9768617Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9768910Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:38:31.9768986Z {} 2025-12-04T13:38:31.9769096Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 
2025-12-04T13:38:31.9769174Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:38:31.9769793Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9769875Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9770161Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:38:31.9770229Z {} 2025-12-04T13:38:31.9770352Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:38:31.9770428Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:38:31.9770916Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9770983Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9771339Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9771389Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9771683Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:38:31.9771745Z {} 2025-12-04T13:38:31.9771850Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:38:31.9771923Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:38:31.9772415Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:31.9772489Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9772845Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9772891Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9773179Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:38:31.9773245Z {} 2025-12-04T13:38:31.9773344Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:38:31.9773432Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:38:31.9773921Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9773985Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9774130Z [rank2]:E1204 12:58:07.410000 379850 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9774308Z [rank2]:E1204 12:58:07.410000 379850 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9774600Z [rank2]:E1204 12:58:07.410000 379850 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9774768Z [rank2]:E1204 12:58:07.410000 379850 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9775056Z [rank2]:E1204 12:58:07.410000 379850 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9775182Z [rank2]:E1204 12:58:07.410000 379850 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9775465Z [rank2]:E1204 12:58:07.410000 379850 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9775614Z [rank2]:E1204 12:58:07.410000 379850 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9775898Z [rank2]:E1204 12:58:07.410000 379850 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9776046Z [rank2]:E1204 12:58:07.410000 379850 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9776329Z [rank2]:E1204 12:58:07.410000 379850 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9776470Z [rank2]:E1204 12:58:07.410000 379850 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9776748Z [rank2]:E1204 12:58:07.410000 379850 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9776911Z [rank2]:E1204 12:58:07.410000 379850 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9777403Z [rank2]:E1204 12:58:07.410000 379850 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 2. CUDA driver allocated memory was 2300575744 and is now 3076521984. 2025-12-04T13:38:31.9777523Z [rank2]:E1204 12:58:07.410000 379850 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9777722Z [rank2]:E1204 12:58:07.410000 379850 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9778112Z [rank2]:E1204 12:58:07.410000 379850 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda 2025-12-04T13:38:31.9778231Z [rank2]:E1204 12:58:07.410000 379850 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9778446Z [rank2]:E1204 12:58:07.410000 379850 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9778624Z [rank2]:E1204 12:58:07.410000 379850 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:31.9778763Z [rank1]:E1204 12:58:07.410000 379849 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9778937Z [rank1]:E1204 12:58:07.410000 379849 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9779224Z [rank1]:E1204 12:58:07.410000 379849 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9779381Z [rank1]:E1204 12:58:07.410000 379849 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9779720Z [rank1]:E1204 12:58:07.410000 379849 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9779844Z [rank1]:E1204 12:58:07.410000 379849 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9780127Z [rank1]:E1204 12:58:07.410000 379849 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9780276Z [rank1]:E1204 12:58:07.410000 379849 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9780559Z [rank1]:E1204 12:58:07.410000 379849 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9780707Z [rank1]:E1204 12:58:07.410000 379849 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9780987Z [rank1]:E1204 12:58:07.410000 379849 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9781142Z [rank1]:E1204 12:58:07.410000 379849 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9781422Z [rank1]:E1204 12:58:07.410000 379849 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9781572Z [rank1]:E1204 12:58:07.410000 379849 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9782062Z [rank1]:E1204 12:58:07.410000 379849 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 1. CUDA driver allocated memory was 2317352960 and is now 3093299200. 
2025-12-04T13:38:31.9782193Z [rank1]:E1204 12:58:07.410000 379849 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9782390Z [rank1]:E1204 12:58:07.410000 379849 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9782765Z [rank1]:E1204 12:58:07.410000 379849 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda 2025-12-04T13:38:31.9782894Z [rank1]:E1204 12:58:07.410000 379849 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9783107Z [rank1]:E1204 12:58:07.410000 379849 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9783278Z [rank1]:E1204 12:58:07.410000 379849 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:31.9783335Z dist init r=2, world=4 2025-12-04T13:38:31.9783378Z dist init r=1, world=4 2025-12-04T13:38:31.9783515Z [rank3]:E1204 12:58:07.412000 379851 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9783675Z [rank3]:E1204 12:58:07.412000 379851 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9783961Z [rank3]:E1204 12:58:07.412000 379851 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9784121Z [rank3]:E1204 12:58:07.412000 379851 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9784409Z [rank3]:E1204 12:58:07.412000 379851 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9784533Z [rank3]:E1204 12:58:07.412000 379851 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9784819Z [rank3]:E1204 12:58:07.412000 379851 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9784966Z [rank3]:E1204 12:58:07.412000 379851 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9785250Z [rank3]:E1204 12:58:07.412000 379851 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9785408Z [rank3]:E1204 12:58:07.412000 379851 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9785689Z [rank3]:E1204 12:58:07.412000 379851 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9785825Z [rank3]:E1204 12:58:07.412000 379851 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9786107Z [rank3]:E1204 12:58:07.412000 379851 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9786261Z [rank3]:E1204 12:58:07.412000 379851 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9786761Z [rank3]:E1204 12:58:07.412000 379851 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 3. CUDA driver allocated memory was 2250244096 and is now 3026190336. 2025-12-04T13:38:31.9786880Z [rank3]:E1204 12:58:07.412000 379851 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9787074Z [rank3]:E1204 12:58:07.412000 379851 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9787465Z [rank3]:E1204 12:58:07.412000 379851 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda 2025-12-04T13:38:31.9787592Z [rank3]:E1204 12:58:07.412000 379851 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9787800Z [rank3]:E1204 12:58:07.412000 379851 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9787965Z [rank3]:E1204 12:58:07.412000 379851 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:31.9788003Z dist init r=3, world=4 2025-12-04T13:38:31.9788142Z [rank0]:E1204 12:58:07.456000 379848 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9788301Z [rank0]:E1204 12:58:07.456000 379848 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9788592Z [rank0]:E1204 12:58:07.456000 379848 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9788747Z [rank0]:E1204 12:58:07.456000 379848 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9789036Z [rank0]:E1204 12:58:07.456000 379848 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9789163Z [rank0]:E1204 12:58:07.456000 379848 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9789442Z [rank0]:E1204 12:58:07.456000 379848 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9789636Z 
[rank0]:E1204 12:58:07.456000 379848 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9789930Z [rank0]:E1204 12:58:07.456000 379848 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9790082Z [rank0]:E1204 12:58:07.456000 379848 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9790375Z [rank0]:E1204 12:58:07.456000 379848 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9790517Z [rank0]:E1204 12:58:07.456000 379848 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9790815Z [rank0]:E1204 12:58:07.456000 379848 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9790965Z [rank0]:E1204 12:58:07.456000 379848 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9791459Z [rank0]:E1204 12:58:07.456000 379848 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 0. CUDA driver allocated memory was 2453667840 and is now 3229614080. 2025-12-04T13:38:31.9791586Z [rank0]:E1204 12:58:07.456000 379848 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9791784Z [rank0]:E1204 12:58:07.456000 379848 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9792170Z [rank0]:E1204 12:58:07.456000 379848 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda 2025-12-04T13:38:31.9792285Z [rank0]:E1204 12:58:07.456000 379848 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9792500Z [rank0]:E1204 12:58:07.456000 379848 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9792666Z [rank0]:E1204 12:58:07.456000 379848 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:31.9792709Z dist init r=0, world=4 2025-12-04T13:38:31.9793046Z [rank0]:[W1204 12:58:07.711940587 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:31.9793090Z FAILED [7.8155s] [ 3%] 2025-12-04T13:38:31.9793092Z 2025-12-04T13:38:31.9793150Z =================================== FAILURES =================================== 2025-12-04T13:38:31.9793269Z _ TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda _ 2025-12-04T13:38:31.9793317Z Traceback (most recent call last): 2025-12-04T13:38:31.9793485Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:31.9793532Z self._join_processes(fn) 2025-12-04T13:38:31.9793708Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:31.9793763Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:31.9793959Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:31.9794008Z raise RuntimeError(error) 2025-12-04T13:38:31.9794088Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:31.9794139Z Traceback (most recent call last): 2025-12-04T13:38:31.9794303Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9794349Z getattr(self, test_name)() 2025-12-04T13:38:31.9794510Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9794550Z fn() 2025-12-04T13:38:31.9794704Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9794750Z method(*args, **kwargs) 2025-12-04T13:38:31.9794913Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9794959Z method(*args, **kwargs) 2025-12-04T13:38:31.9795112Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9795154Z with policy(): 2025-12-04T13:38:31.9795310Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9795357Z raise RuntimeError(msg) 2025-12-04T13:38:31.9795739Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 1. CUDA driver allocated memory was 2317352960 and is now 3093299200. 
2025-12-04T13:38:31.9795741Z 2025-12-04T13:38:31.9795822Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9796081Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda 2025-12-04T13:38:31.9796087Z 2025-12-04T13:38:31.9796176Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9796178Z 2025-12-04T13:38:31.9796241Z Process 2 exited with error code 10 and exception: 2025-12-04T13:38:31.9796288Z Traceback (most recent call last): 2025-12-04T13:38:31.9796458Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9796503Z getattr(self, test_name)() 2025-12-04T13:38:31.9796667Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9796703Z fn() 2025-12-04T13:38:31.9796860Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9796901Z method(*args, **kwargs) 2025-12-04T13:38:31.9797056Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9797097Z method(*args, **kwargs) 2025-12-04T13:38:31.9797255Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9797292Z with policy(): 2025-12-04T13:38:31.9797450Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9797491Z raise RuntimeError(msg) 2025-12-04T13:38:31.9797866Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 2. CUDA driver allocated memory was 2300575744 and is now 3076521984. 
2025-12-04T13:38:31.9797879Z 2025-12-04T13:38:31.9797957Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9798204Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda 2025-12-04T13:38:31.9798206Z 2025-12-04T13:38:31.9798297Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9798300Z 2025-12-04T13:38:31.9798360Z Process 3 exited with error code 10 and exception: 2025-12-04T13:38:31.9798409Z Traceback (most recent call last): 2025-12-04T13:38:31.9798574Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9798621Z getattr(self, test_name)() 2025-12-04T13:38:31.9798792Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9798833Z fn() 2025-12-04T13:38:31.9798987Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9799031Z method(*args, **kwargs) 2025-12-04T13:38:31.9799182Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9799225Z method(*args, **kwargs) 2025-12-04T13:38:31.9799390Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9799431Z with policy(): 2025-12-04T13:38:31.9799622Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9799668Z raise RuntimeError(msg) 2025-12-04T13:38:31.9800061Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 3. CUDA driver allocated memory was 2250244096 and is now 3026190336. 2025-12-04T13:38:31.9800064Z 2025-12-04T13:38:31.9800139Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9800387Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda 2025-12-04T13:38:31.9800390Z 2025-12-04T13:38:31.9800476Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9800479Z 2025-12-04T13:38:31.9800480Z 2025-12-04T13:38:31.9800561Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:31.9800650Z Process 1 terminated with exit code 10, terminating remaining processes. 
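The UserWarning from torch/nn/modules/transformer.py:144 repeated in this run fires because the encoder layer is built with the default batch_first=False, which disables the nested-tensor fast path. A minimal sketch of the batch-first construction the warning points at (an illustrative toy model, not the test's actual transformer):

    import torch
    from torch import nn

    # batch_first=True keeps enable_nested_tensor usable and avoids the warning.
    layer = nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True)
    encoder = nn.TransformerEncoder(layer, num_layers=2, enable_nested_tensor=True)

    out = encoder(torch.randn(2, 10, 32))  # (batch, seq, feature) because batch_first=True
    print(out.shape)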
2025-12-04T13:38:31.9800889Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-147ebbffa2c93fc5.xml - 2025-12-04T13:38:31.9800952Z =========================== short test summary info ============================ 2025-12-04T13:38:31.9801219Z FAILED [7.8155s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_False_mixed_precision_True_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:31.9801270Z Traceback (most recent call last): 2025-12-04T13:38:31.9801434Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9801482Z getattr(self, test_name)() 2025-12-04T13:38:31.9801646Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9801685Z fn() 2025-12-04T13:38:31.9801852Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9801897Z method(*args, **kwargs) 2025-12-04T13:38:31.9802050Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9802093Z method(*args, **kwargs) 2025-12-04T13:38:31.9802246Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9802288Z with policy(): 2025-12-04T13:38:31.9802441Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9802487Z raise RuntimeError(msg) 2025-12-04T13:38:31.9802872Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 1. CUDA driver allocated memory was 2317352960 and is now 3093299200. 
2025-12-04T13:38:31.9802876Z 2025-12-04T13:38:31.9802955Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9803205Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda 2025-12-04T13:38:31.9803207Z 2025-12-04T13:38:31.9803294Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9803311Z 2025-12-04T13:38:31.9803374Z Process 2 exited with error code 10 and exception: 2025-12-04T13:38:31.9803419Z Traceback (most recent call last): 2025-12-04T13:38:31.9803585Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9803640Z getattr(self, test_name)() 2025-12-04T13:38:31.9803804Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9803839Z fn() 2025-12-04T13:38:31.9803995Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9804035Z method(*args, **kwargs) 2025-12-04T13:38:31.9804188Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9804229Z method(*args, **kwargs) 2025-12-04T13:38:31.9804384Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9804421Z with policy(): 2025-12-04T13:38:31.9804578Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9804622Z raise RuntimeError(msg) 2025-12-04T13:38:31.9804994Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 2. CUDA driver allocated memory was 2300575744 and is now 3076521984. 
2025-12-04T13:38:31.9804996Z 2025-12-04T13:38:31.9805074Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9805318Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda 2025-12-04T13:38:31.9805321Z 2025-12-04T13:38:31.9805413Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9805415Z 2025-12-04T13:38:31.9805473Z Process 3 exited with error code 10 and exception: 2025-12-04T13:38:31.9805523Z Traceback (most recent call last): 2025-12-04T13:38:31.9805699Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9805745Z getattr(self, test_name)() 2025-12-04T13:38:31.9805904Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9805943Z fn() 2025-12-04T13:38:31.9806095Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9806139Z method(*args, **kwargs) 2025-12-04T13:38:31.9806290Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9806333Z method(*args, **kwargs) 2025-12-04T13:38:31.9806483Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9806536Z with policy(): 2025-12-04T13:38:31.9806690Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9806736Z raise RuntimeError(msg) 2025-12-04T13:38:31.9807106Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 3. CUDA driver allocated memory was 2250244096 and is now 3026190336. 2025-12-04T13:38:31.9807109Z 2025-12-04T13:38:31.9807193Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9807439Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda 2025-12-04T13:38:31.9807441Z 2025-12-04T13:38:31.9807528Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9807610Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:31.9807674Z ======================= 1 failed, 2 deselected in 7.98s ======================== 2025-12-04T13:38:31.9807716Z Got exit code 1 2025-12-04T13:38:31.9807757Z Retrying single test... 
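Every run above also ends with the ProcessGroupNCCL warning that destroy_process_group() was not called before program exit. A minimal cleanup sketch following that guidance, assuming a single process and the gloo backend so it runs without a GPU (the address and port are arbitrary placeholders):

    import torch.distributed as dist

    dist.init_process_group(backend="gloo",
                            init_method="tcp://127.0.0.1:29500",
                            rank=0, world_size=1)
    try:
        pass  # collectives / FSDP work would go here
    finally:
        # Tear the group down before the process exits, per the warning above.
        dist.destroy_process_group()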
2025-12-04T13:38:31.9807952Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-206ea0eb46205b47.xml 2025-12-04T13:38:31.9808011Z ============================= test session starts ============================== 2025-12-04T13:38:31.9808131Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:31.9808173Z cachedir: .pytest_cache 2025-12-04T13:38:31.9808337Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:31.9808385Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:31.9808431Z configfile: pytest.ini 2025-12-04T13:38:31.9808601Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:31.9808675Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:31.9808927Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_False_mixed_precision_True_cuda 2025-12-04T13:38:31.9808972Z Running 1 items in this shard 2025-12-04T13:38:31.9808975Z 2025-12-04T13:38:31.9809307Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_False_mixed_precision_True_cuda I1204 12:58:11.565000 380165 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 380234 2025-12-04T13:38:31.9809463Z I1204 12:58:11.566000 380165 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 380235 2025-12-04T13:38:31.9809774Z I1204 12:58:11.566000 380165 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 380236 2025-12-04T13:38:31.9809925Z I1204 12:58:11.567000 380165 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 380237 2025-12-04T13:38:31.9810288Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9810343Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9810707Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9810759Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9811052Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:38:31.9811121Z {} 2025-12-04T13:38:31.9811226Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:38:31.9811303Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:38:31.9811812Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9811891Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9812182Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:38:31.9812246Z {} 2025-12-04T13:38:31.9812354Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:38:31.9812426Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:38:31.9812921Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9812983Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9813340Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9813385Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9813738Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9813786Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9814069Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:38:31.9814144Z {} 2025-12-04T13:38:31.9814245Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:38:31.9814317Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:38:31.9814806Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9814869Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9815166Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:38:31.9815230Z {} 2025-12-04T13:38:31.9815329Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 
2025-12-04T13:38:31.9815403Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:38:31.9815896Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9815965Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9816116Z [rank2]:E1204 12:58:17.636000 380236 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9816292Z [rank2]:E1204 12:58:17.636000 380236 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9816589Z [rank2]:E1204 12:58:17.636000 380236 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9816747Z [rank2]:E1204 12:58:17.636000 380236 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9817040Z [rank2]:E1204 12:58:17.636000 380236 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9817169Z [rank2]:E1204 12:58:17.636000 380236 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9817448Z [rank2]:E1204 12:58:17.636000 380236 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9817604Z [rank2]:E1204 12:58:17.636000 380236 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9817883Z [rank2]:E1204 12:58:17.636000 380236 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9818037Z [rank2]:E1204 12:58:17.636000 380236 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9818316Z [rank2]:E1204 12:58:17.636000 380236 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9818475Z [rank2]:E1204 12:58:17.636000 380236 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9818753Z [rank2]:E1204 12:58:17.636000 380236 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9818905Z [rank2]:E1204 12:58:17.636000 380236 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9819413Z [rank2]:E1204 12:58:17.636000 380236 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 2. CUDA driver allocated memory was 2300575744 and is now 3076521984. 2025-12-04T13:38:31.9819529Z [rank2]:E1204 12:58:17.636000 380236 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9819776Z [rank2]:E1204 12:58:17.636000 380236 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9820153Z [rank2]:E1204 12:58:17.636000 380236 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda 2025-12-04T13:38:31.9820284Z [rank2]:E1204 12:58:17.636000 380236 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9820498Z [rank2]:E1204 12:58:17.636000 380236 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9820677Z [rank2]:E1204 12:58:17.636000 380236 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:31.9820717Z dist init r=2, world=4 2025-12-04T13:38:31.9820855Z [rank0]:E1204 12:58:17.695000 380234 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9821019Z [rank0]:E1204 12:58:17.695000 380234 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9821308Z [rank0]:E1204 12:58:17.695000 380234 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9821465Z [rank0]:E1204 12:58:17.695000 380234 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9821754Z [rank0]:E1204 12:58:17.695000 380234 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9821883Z [rank0]:E1204 12:58:17.695000 380234 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9822165Z [rank0]:E1204 12:58:17.695000 380234 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9822313Z [rank0]:E1204 12:58:17.695000 380234 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9822593Z [rank0]:E1204 12:58:17.695000 380234 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9822756Z [rank0]:E1204 12:58:17.695000 380234 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9823036Z [rank0]:E1204 12:58:17.695000 380234 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9823173Z [rank0]:E1204 12:58:17.695000 380234 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9823455Z [rank0]:E1204 12:58:17.695000 380234 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9823622Z [rank0]:E1204 12:58:17.695000 380234 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9824116Z [rank0]:E1204 12:58:17.695000 380234 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 0. CUDA driver allocated memory was 2453667840 and is now 3229614080. 2025-12-04T13:38:31.9824237Z [rank0]:E1204 12:58:17.695000 380234 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9824434Z [rank0]:E1204 12:58:17.695000 380234 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9824818Z [rank0]:E1204 12:58:17.695000 380234 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda 2025-12-04T13:38:31.9824941Z [rank0]:E1204 12:58:17.695000 380234 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9825153Z [rank0]:E1204 12:58:17.695000 380234 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9825319Z [rank0]:E1204 12:58:17.695000 380234 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:31.9825360Z dist init r=0, world=4 2025-12-04T13:38:31.9825500Z [rank1]:E1204 12:58:17.719000 380235 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9825658Z [rank1]:E1204 12:58:17.719000 380235 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9825946Z [rank1]:E1204 12:58:17.719000 380235 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9826100Z [rank1]:E1204 12:58:17.719000 380235 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9826388Z [rank1]:E1204 12:58:17.719000 380235 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9826513Z [rank1]:E1204 12:58:17.719000 380235 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9826794Z [rank1]:E1204 12:58:17.719000 380235 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9826959Z [rank1]:E1204 12:58:17.719000 380235 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9827237Z [rank1]:E1204 12:58:17.719000 380235 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9827386Z [rank1]:E1204 12:58:17.719000 380235 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9827663Z [rank1]:E1204 12:58:17.719000 380235 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9827804Z [rank1]:E1204 12:58:17.719000 380235 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9828097Z [rank1]:E1204 12:58:17.719000 380235 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9828250Z [rank1]:E1204 12:58:17.719000 380235 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9828739Z [rank1]:E1204 12:58:17.719000 380235 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 1. CUDA driver allocated memory was 2317352960 and is now 3093299200. 
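The transformer.py UserWarning repeated in the session output above explains why the nested-tensor fast path is disabled: enable_nested_tensor is True on nn.TransformerEncoder, but it only takes effect when the encoder layer is built with batch_first=True. A minimal sketch with illustrative sizes (d_model, nhead, and num_layers are assumptions, not values taken from the test):

import torch.nn as nn

# batch_first=True keeps the nested-tensor path usable and silences the warning.
layer = nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=2, enable_nested_tensor=True)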
2025-12-04T13:38:31.9828862Z [rank1]:E1204 12:58:17.719000 380235 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9829058Z [rank1]:E1204 12:58:17.719000 380235 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9829442Z [rank1]:E1204 12:58:17.719000 380235 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda 2025-12-04T13:38:31.9829557Z [rank1]:E1204 12:58:17.719000 380235 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9829804Z [rank1]:E1204 12:58:17.719000 380235 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9829969Z [rank1]:E1204 12:58:17.719000 380235 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:31.9830011Z dist init r=1, world=4 2025-12-04T13:38:31.9830149Z [rank3]:E1204 12:58:17.727000 380237 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9830312Z [rank3]:E1204 12:58:17.727000 380237 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9830599Z [rank3]:E1204 12:58:17.727000 380237 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9830757Z [rank3]:E1204 12:58:17.727000 380237 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9831043Z [rank3]:E1204 12:58:17.727000 380237 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9831171Z [rank3]:E1204 12:58:17.727000 380237 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9831465Z [rank3]:E1204 12:58:17.727000 380237 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9831617Z [rank3]:E1204 12:58:17.727000 380237 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9831894Z [rank3]:E1204 12:58:17.727000 380237 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9832040Z [rank3]:E1204 12:58:17.727000 380237 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9832329Z [rank3]:E1204 12:58:17.727000 380237 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9832465Z [rank3]:E1204 12:58:17.727000 380237 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:31.9832750Z [rank3]:E1204 12:58:17.727000 380237 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9832898Z [rank3]:E1204 12:58:17.727000 380237 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9833405Z [rank3]:E1204 12:58:17.727000 380237 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 3. CUDA driver allocated memory was 2250244096 and is now 3026190336. 2025-12-04T13:38:31.9833543Z [rank3]:E1204 12:58:17.727000 380237 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9833738Z [rank3]:E1204 12:58:17.727000 380237 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9834114Z [rank3]:E1204 12:58:17.727000 380237 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda 2025-12-04T13:38:31.9834227Z [rank3]:E1204 12:58:17.727000 380237 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9834444Z [rank3]:E1204 12:58:17.727000 380237 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9834609Z [rank3]:E1204 12:58:17.727000 380237 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:31.9834654Z dist init r=3, world=4 2025-12-04T13:38:31.9834997Z [rank0]:[W1204 12:58:17.963404852 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:31.9835038Z FAILED [7.8186s] [100%] 2025-12-04T13:38:31.9835041Z 2025-12-04T13:38:31.9835102Z =================================== FAILURES =================================== 2025-12-04T13:38:31.9835218Z _ TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda _ 2025-12-04T13:38:31.9835269Z Traceback (most recent call last): 2025-12-04T13:38:31.9835433Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:31.9835493Z self._join_processes(fn) 2025-12-04T13:38:31.9835667Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:31.9835727Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:31.9835905Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:31.9835954Z raise RuntimeError(error) 2025-12-04T13:38:31.9836035Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:38:31.9836085Z Traceback (most recent call last): 2025-12-04T13:38:31.9836247Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9836295Z getattr(self, test_name)() 2025-12-04T13:38:31.9836466Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9836506Z fn() 2025-12-04T13:38:31.9836657Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9836703Z method(*args, **kwargs) 2025-12-04T13:38:31.9836861Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9836903Z method(*args, **kwargs) 2025-12-04T13:38:31.9837069Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9837108Z with policy(): 2025-12-04T13:38:31.9837266Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9837308Z raise RuntimeError(msg) 2025-12-04T13:38:31.9837690Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 2. CUDA driver allocated memory was 2300575744 and is now 3076521984. 
2025-12-04T13:38:31.9837692Z 2025-12-04T13:38:31.9837767Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9862784Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda 2025-12-04T13:38:31.9862796Z 2025-12-04T13:38:31.9862910Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9862912Z 2025-12-04T13:38:31.9862913Z 2025-12-04T13:38:31.9862996Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:31.9863091Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:31.9863331Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-206ea0eb46205b47.xml - 2025-12-04T13:38:31.9863396Z =========================== short test summary info ============================ 2025-12-04T13:38:31.9863661Z FAILED [7.8186s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_False_mixed_precision_True_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:38:31.9863713Z Traceback (most recent call last): 2025-12-04T13:38:31.9863882Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9863929Z getattr(self, test_name)() 2025-12-04T13:38:31.9864091Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9864131Z fn() 2025-12-04T13:38:31.9864324Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9864367Z method(*args, **kwargs) 2025-12-04T13:38:31.9864520Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9864562Z method(*args, **kwargs) 2025-12-04T13:38:31.9864713Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9864754Z with policy(): 2025-12-04T13:38:31.9864905Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9864948Z raise RuntimeError(msg) 2025-12-04T13:38:31.9865332Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 2. CUDA driver allocated memory was 2300575744 and is now 3076521984. 2025-12-04T13:38:31.9865338Z 2025-12-04T13:38:31.9865414Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9865662Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda 2025-12-04T13:38:31.9865664Z 2025-12-04T13:38:31.9865766Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9865832Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
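The _init_utils.py UserWarning quoted in the run above ("FSDP got the argument `device_id` cuda ... which does not have an explicit index") suggests two remedies: call torch.cuda.set_device() before constructing FSDP, or pass a fully indexed device as device_id. A minimal sketch of both, assuming the process group is already initialized and build_model() is a hypothetical stand-in for the wrapped module:

import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

rank = dist.get_rank()
torch.cuda.set_device(rank)                                      # remedy 1: make the current device explicit

model = build_model()                                            # hypothetical model constructor
fsdp_model = FSDP(model, device_id=torch.device("cuda", rank))   # remedy 2: pass an indexed device_id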
2025-12-04T13:38:31.9865894Z ======================= 1 failed, 32 deselected in 7.98s ======================= 2025-12-04T13:38:31.9865933Z Got exit code 1 2025-12-04T13:38:31.9865973Z Retrying single test... 2025-12-04T13:38:31.9866182Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-9a27cf1bb561c9e5.xml 2025-12-04T13:38:31.9866242Z ============================= test session starts ============================== 2025-12-04T13:38:31.9866360Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:31.9866400Z cachedir: .pytest_cache 2025-12-04T13:38:31.9866565Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:31.9866613Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:31.9866658Z configfile: pytest.ini 2025-12-04T13:38:31.9866823Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:31.9866899Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:31.9867144Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_False_mixed_precision_True_cuda 2025-12-04T13:38:31.9867191Z Running 1 items in this shard 2025-12-04T13:38:31.9867193Z 2025-12-04T13:38:31.9867517Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_False_mixed_precision_True_cuda I1204 12:58:22.041000 380551 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 380620 2025-12-04T13:38:31.9867673Z I1204 12:58:22.042000 380551 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 380621 2025-12-04T13:38:31.9867827Z I1204 12:58:22.042000 380551 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 380622 2025-12-04T13:38:31.9867977Z I1204 12:58:22.043000 380551 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 380623 2025-12-04T13:38:31.9868353Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9868402Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9868693Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:38:31.9868761Z {} 2025-12-04T13:38:31.9868865Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:38:31.9868942Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:38:31.9869450Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:31.9869515Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9869907Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9869973Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9870260Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:38:31.9870338Z {} 2025-12-04T13:38:31.9870444Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:38:31.9870518Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:38:31.9871008Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9871069Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9871427Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9871474Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9871763Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:38:31.9871824Z {} 2025-12-04T13:38:31.9871929Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:38:31.9872000Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:38:31.9872494Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:31.9872570Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9872922Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9872971Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9873254Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:38:31.9873318Z {} 2025-12-04T13:38:31.9873417Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:38:31.9873505Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:38:31.9873991Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9874053Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9874219Z [rank3]:E1204 12:58:28.172000 380623 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9874382Z [rank3]:E1204 12:58:28.172000 380623 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9874678Z [rank3]:E1204 12:58:28.172000 380623 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9874844Z [rank3]:E1204 12:58:28.172000 380623 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9875133Z [rank3]:E1204 12:58:28.172000 380623 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9875258Z [rank3]:E1204 12:58:28.172000 380623 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9875540Z [rank3]:E1204 12:58:28.172000 380623 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9875692Z [rank3]:E1204 12:58:28.172000 380623 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9875969Z [rank3]:E1204 12:58:28.172000 380623 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9876118Z [rank3]:E1204 12:58:28.172000 380623 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9876392Z [rank3]:E1204 12:58:28.172000 380623 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9876531Z [rank3]:E1204 12:58:28.172000 380623 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9876822Z [rank3]:E1204 12:58:28.172000 380623 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9876974Z [rank3]:E1204 12:58:28.172000 380623 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9877467Z [rank3]:E1204 12:58:28.172000 380623 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 3. CUDA driver allocated memory was 2250244096 and is now 3026190336. 2025-12-04T13:38:31.9877583Z [rank3]:E1204 12:58:28.172000 380623 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9877790Z [rank3]:E1204 12:58:28.172000 380623 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9878170Z [rank3]:E1204 12:58:28.172000 380623 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda 2025-12-04T13:38:31.9878286Z [rank3]:E1204 12:58:28.172000 380623 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9878498Z [rank3]:E1204 12:58:28.172000 380623 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9878680Z [rank3]:E1204 12:58:28.172000 380623 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:31.9878722Z dist init r=3, world=4 2025-12-04T13:38:31.9878860Z [rank1]:E1204 12:58:28.176000 380621 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9879034Z [rank1]:E1204 12:58:28.176000 380621 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9879321Z [rank1]:E1204 12:58:28.176000 380621 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9879476Z [rank1]:E1204 12:58:28.176000 380621 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9879794Z [rank1]:E1204 12:58:28.176000 380621 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9879922Z [rank1]:E1204 12:58:28.176000 380621 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9880201Z [rank1]:E1204 12:58:28.176000 380621 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9880351Z [rank1]:E1204 12:58:28.176000 380621 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9880630Z [rank1]:E1204 12:58:28.176000 380621 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9880777Z [rank1]:E1204 12:58:28.176000 380621 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9881056Z [rank1]:E1204 12:58:28.176000 380621 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9881205Z [rank1]:E1204 12:58:28.176000 380621 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9881484Z [rank1]:E1204 12:58:28.176000 380621 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9881631Z [rank1]:E1204 12:58:28.176000 380621 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9882135Z [rank1]:E1204 12:58:28.176000 380621 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 1. CUDA driver allocated memory was 2317352960 and is now 3093299200. 
2025-12-04T13:38:31.9882253Z [rank1]:E1204 12:58:28.176000 380621 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9882449Z [rank1]:E1204 12:58:28.176000 380621 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9882824Z [rank1]:E1204 12:58:28.176000 380621 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda 2025-12-04T13:38:31.9882949Z [rank1]:E1204 12:58:28.176000 380621 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9883162Z [rank1]:E1204 12:58:28.176000 380621 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9883340Z [rank1]:E1204 12:58:28.176000 380621 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:31.9883381Z dist init r=1, world=4 2025-12-04T13:38:31.9883521Z [rank0]:E1204 12:58:28.217000 380620 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9883680Z [rank0]:E1204 12:58:28.217000 380620 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9883968Z [rank0]:E1204 12:58:28.217000 380620 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9884121Z [rank0]:E1204 12:58:28.217000 380620 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9884407Z [rank0]:E1204 12:58:28.217000 380620 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9884530Z [rank0]:E1204 12:58:28.217000 380620 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9884808Z [rank0]:E1204 12:58:28.217000 380620 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9884955Z [rank0]:E1204 12:58:28.217000 380620 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9885232Z [rank0]:E1204 12:58:28.217000 380620 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9885392Z [rank0]:E1204 12:58:28.217000 380620 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9885668Z [rank0]:E1204 12:58:28.217000 380620 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9885804Z [rank0]:E1204 12:58:28.217000 380620 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:31.9886082Z [rank0]:E1204 12:58:28.217000 380620 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9886235Z [rank0]:E1204 12:58:28.217000 380620 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9886746Z [rank0]:E1204 12:58:28.217000 380620 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 0. CUDA driver allocated memory was 2453667840 and is now 3229614080. 2025-12-04T13:38:31.9886861Z [rank0]:E1204 12:58:28.217000 380620 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9887058Z [rank0]:E1204 12:58:28.217000 380620 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9887441Z [rank0]:E1204 12:58:28.217000 380620 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda 2025-12-04T13:38:31.9887567Z [rank0]:E1204 12:58:28.217000 380620 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9887776Z [rank0]:E1204 12:58:28.217000 380620 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9887942Z [rank0]:E1204 12:58:28.217000 380620 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:31.9887980Z dist init r=0, world=4 2025-12-04T13:38:31.9888121Z [rank2]:E1204 12:58:28.230000 380622 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9888285Z [rank2]:E1204 12:58:28.230000 380622 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9888577Z [rank2]:E1204 12:58:28.230000 380622 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9888733Z [rank2]:E1204 12:58:28.230000 380622 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9889017Z [rank2]:E1204 12:58:28.230000 380622 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9889141Z [rank2]:E1204 12:58:28.230000 380622 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9889419Z [rank2]:E1204 12:58:28.230000 380622 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9889610Z [rank2]:E1204 12:58:28.230000 380622 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9889925Z [rank2]:E1204 12:58:28.230000 380622 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9890071Z [rank2]:E1204 12:58:28.230000 380622 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9890349Z [rank2]:E1204 12:58:28.230000 380622 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9890486Z [rank2]:E1204 12:58:28.230000 380622 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9890791Z [rank2]:E1204 12:58:28.230000 380622 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9890940Z [rank2]:E1204 12:58:28.230000 380622 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9891438Z [rank2]:E1204 12:58:28.230000 380622 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 2. CUDA driver allocated memory was 2300575744 and is now 3076521984. 2025-12-04T13:38:31.9891571Z [rank2]:E1204 12:58:28.230000 380622 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9891765Z [rank2]:E1204 12:58:28.230000 380622 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9892149Z [rank2]:E1204 12:58:28.230000 380622 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda 2025-12-04T13:38:31.9892261Z [rank2]:E1204 12:58:28.230000 380622 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9892473Z [rank2]:E1204 12:58:28.230000 380622 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9892636Z [rank2]:E1204 12:58:28.230000 380622 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:31.9892677Z dist init r=2, world=4 2025-12-04T13:38:31.9893016Z [rank0]:[W1204 12:58:28.495452182 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:31.9893058Z FAILED [7.8180s] [100%] 2025-12-04T13:38:31.9893060Z 2025-12-04T13:38:31.9893120Z =================================== FAILURES =================================== 2025-12-04T13:38:31.9893234Z _ TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda _ 2025-12-04T13:38:31.9893281Z Traceback (most recent call last): 2025-12-04T13:38:31.9893446Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:31.9893492Z self._join_processes(fn) 2025-12-04T13:38:31.9893664Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:31.9893722Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:31.9893909Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:31.9893955Z raise RuntimeError(error) 2025-12-04T13:38:31.9894035Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:31.9894081Z Traceback (most recent call last): 2025-12-04T13:38:31.9894243Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9894289Z getattr(self, test_name)() 2025-12-04T13:38:31.9894446Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9894482Z fn() 2025-12-04T13:38:31.9894634Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9894689Z method(*args, **kwargs) 2025-12-04T13:38:31.9894841Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9894885Z method(*args, **kwargs) 2025-12-04T13:38:31.9895037Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9895077Z with policy(): 2025-12-04T13:38:31.9897008Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9897069Z raise RuntimeError(msg) 2025-12-04T13:38:31.9897438Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 3. CUDA driver allocated memory was 2250244096 and is now 3026190336. 
2025-12-04T13:38:31.9897460Z 2025-12-04T13:38:31.9897535Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9897785Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda 2025-12-04T13:38:31.9897788Z 2025-12-04T13:38:31.9897875Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9897878Z 2025-12-04T13:38:31.9897879Z 2025-12-04T13:38:31.9897957Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:31.9898045Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:31.9898282Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-9a27cf1bb561c9e5.xml - 2025-12-04T13:38:31.9898346Z =========================== short test summary info ============================ 2025-12-04T13:38:31.9898610Z FAILED [7.8180s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_False_mixed_precision_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:31.9898660Z Traceback (most recent call last): 2025-12-04T13:38:31.9898825Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9898870Z getattr(self, test_name)() 2025-12-04T13:38:31.9899031Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9899069Z fn() 2025-12-04T13:38:31.9899221Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9899265Z method(*args, **kwargs) 2025-12-04T13:38:31.9899417Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9899460Z method(*args, **kwargs) 2025-12-04T13:38:31.9899658Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9899699Z with policy(): 2025-12-04T13:38:31.9899851Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9899894Z raise RuntimeError(msg) 2025-12-04T13:38:31.9900261Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 13312 on device 3. CUDA driver allocated memory was 2250244096 and is now 3026190336. 2025-12-04T13:38:31.9900266Z 2025-12-04T13:38:31.9900341Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9900609Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_False_mixed_precision_True_cuda 2025-12-04T13:38:31.9900611Z 2025-12-04T13:38:31.9900697Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9900763Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
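The failure above comes from PyTorch's test-time CUDA memory-leak check, which records caching-allocator and driver-level memory usage on each device before the test body and raises if either number has grown afterwards; those are the two quantities quoted in the RuntimeError ("Caching allocator allocated memory" and "CUDA driver allocated memory"). The snippet below is a minimal sketch of that kind of before/after comparison, not the harness's actual implementation (which lives in torch/testing/_internal/common_utils.py); the helper names snapshot and check_for_leak are illustrative only. On ROCm builds the torch.cuda API maps to HIP, so the same calls apply.

    # Illustrative sketch of a before/after memory comparison, not the real leak check.
    import torch

    def snapshot(device: int):
        torch.cuda.synchronize(device)
        allocator_bytes = torch.cuda.memory_allocated(device)      # caching-allocator view
        free_bytes, total_bytes = torch.cuda.mem_get_info(device)  # driver-level view
        return allocator_bytes, total_bytes - free_bytes

    def check_for_leak(fn, device: int = 0):
        before_alloc, before_driver = snapshot(device)
        fn()                                                        # code under test
        torch.cuda.empty_cache()                                    # drop cached blocks first
        after_alloc, after_driver = snapshot(device)
        if after_alloc > before_alloc or after_driver > before_driver:
            raise RuntimeError(
                f"possible leak on device {device}: "
                f"allocator {before_alloc} -> {after_alloc}, "
                f"driver {before_driver} -> {after_driver}"
            )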
2025-12-04T13:38:31.9900826Z ======================= 1 failed, 32 deselected in 7.96s ======================= 2025-12-04T13:38:31.9900868Z Got exit code 1 2025-12-04T13:38:31.9901080Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_False_mixed_precision_True_cuda 2025-12-04T13:38:31.9901210Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:31.9901401Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-daad1a9afcbc47ee.xml 2025-12-04T13:38:31.9901479Z ============================= test session starts ============================== 2025-12-04T13:38:31.9901592Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:31.9901635Z cachedir: .pytest_cache 2025-12-04T13:38:31.9901794Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:31.9901843Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:31.9901883Z configfile: pytest.ini 2025-12-04T13:38:31.9902051Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:31.9902127Z collecting ... collected 60 items / 3 deselected / 57 selected 2025-12-04T13:38:31.9902180Z stepcurrent: skipping 3 already run items. 2025-12-04T13:38:31.9902224Z Running 30 items in this shard 2025-12-04T13:38:31.9902228Z 2025-12-04T13:38:31.9902553Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_True_mixed_precision_False_cuda I1204 12:58:32.635000 380937 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 381006 2025-12-04T13:38:31.9902711Z I1204 12:58:32.635000 380937 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 381007 2025-12-04T13:38:31.9902862Z I1204 12:58:32.636000 380937 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 381008 2025-12-04T13:38:31.9903017Z I1204 12:58:32.636000 380937 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 381009 2025-12-04T13:38:31.9903378Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9903430Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9903794Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9903840Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9904330Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:31.9904394Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9904895Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9904957Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9905313Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9905370Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9905855Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9905928Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9906279Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9906328Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9906812Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:31.9906874Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9907019Z [rank0]:E1204 12:58:38.657000 381006 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9907179Z [rank0]:E1204 12:58:38.657000 381006 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9907477Z [rank0]:E1204 12:58:38.657000 381006 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9907633Z [rank0]:E1204 12:58:38.657000 381006 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9907934Z [rank0]:E1204 12:58:38.657000 381006 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9908059Z [rank0]:E1204 12:58:38.657000 381006 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9908339Z [rank0]:E1204 12:58:38.657000 381006 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9908488Z [rank0]:E1204 12:58:38.657000 381006 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9908765Z [rank0]:E1204 12:58:38.657000 381006 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9908926Z [rank0]:E1204 12:58:38.657000 381006 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9909201Z [rank0]:E1204 12:58:38.657000 381006 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9909340Z [rank0]:E1204 12:58:38.657000 381006 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9909655Z [rank0]:E1204 12:58:38.657000 381006 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9909818Z [rank0]:E1204 12:58:38.657000 381006 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9910312Z [rank0]:E1204 12:58:38.657000 381006 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3229614080. 
2025-12-04T13:38:31.9910440Z [rank0]:E1204 12:58:38.657000 381006 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9910636Z [rank0]:E1204 12:58:38.657000 381006 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9911012Z [rank0]:E1204 12:58:38.657000 381006 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda 2025-12-04T13:38:31.9911129Z [rank0]:E1204 12:58:38.657000 381006 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9911340Z [rank0]:E1204 12:58:38.657000 381006 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9911509Z [rank0]:E1204 12:58:38.657000 381006 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:31.9911549Z dist init r=0, world=4 2025-12-04T13:38:31.9911685Z [rank3]:E1204 12:58:38.666000 381009 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9911849Z [rank3]:E1204 12:58:38.666000 381009 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9912136Z [rank3]:E1204 12:58:38.666000 381009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9912304Z [rank3]:E1204 12:58:38.666000 381009 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9912587Z [rank3]:E1204 12:58:38.666000 381009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9912712Z [rank3]:E1204 12:58:38.666000 381009 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9912988Z [rank3]:E1204 12:58:38.666000 381009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9913150Z [rank3]:E1204 12:58:38.666000 381009 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9913430Z [rank3]:E1204 12:58:38.666000 381009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9913576Z [rank3]:E1204 12:58:38.666000 381009 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9913858Z [rank3]:E1204 12:58:38.666000 381009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9914010Z [rank3]:E1204 12:58:38.666000 381009 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:31.9914290Z [rank3]:E1204 12:58:38.666000 381009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9914454Z [rank3]:E1204 12:58:38.666000 381009 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9914945Z [rank3]:E1204 12:58:38.666000 381009 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3026190336. 2025-12-04T13:38:31.9915062Z [rank3]:E1204 12:58:38.666000 381009 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9915257Z [rank3]:E1204 12:58:38.666000 381009 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9915632Z [rank3]:E1204 12:58:38.666000 381009 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda 2025-12-04T13:38:31.9915744Z [rank3]:E1204 12:58:38.666000 381009 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9915960Z [rank3]:E1204 12:58:38.666000 381009 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9916125Z [rank3]:E1204 12:58:38.666000 381009 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:31.9916167Z dist init r=3, world=4 2025-12-04T13:38:31.9916305Z [rank2]:E1204 12:58:38.704000 381008 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9916478Z [rank2]:E1204 12:58:38.704000 381008 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9916768Z [rank2]:E1204 12:58:38.704000 381008 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9916921Z [rank2]:E1204 12:58:38.704000 381008 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9917206Z [rank2]:E1204 12:58:38.704000 381008 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9917328Z [rank2]:E1204 12:58:38.704000 381008 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9917615Z [rank2]:E1204 12:58:38.704000 381008 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9917762Z [rank2]:E1204 12:58:38.704000 381008 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9918038Z [rank2]:E1204 12:58:38.704000 381008 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9918196Z [rank2]:E1204 12:58:38.704000 381008 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9918469Z [rank2]:E1204 12:58:38.704000 381008 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9918619Z [rank2]:E1204 12:58:38.704000 381008 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9918896Z [rank2]:E1204 12:58:38.704000 381008 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9919046Z [rank2]:E1204 12:58:38.704000 381008 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9919534Z [rank2]:E1204 12:58:38.704000 381008 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3076521984. 2025-12-04T13:38:31.9919693Z [rank2]:E1204 12:58:38.704000 381008 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9919889Z [rank2]:E1204 12:58:38.704000 381008 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9920257Z [rank2]:E1204 12:58:38.704000 381008 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda 2025-12-04T13:38:31.9920373Z [rank2]:E1204 12:58:38.704000 381008 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9920585Z [rank2]:E1204 12:58:38.704000 381008 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9920763Z [rank2]:E1204 12:58:38.704000 381008 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:31.9920802Z dist init r=2, world=4 2025-12-04T13:38:31.9920945Z [rank1]:E1204 12:58:38.731000 381007 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9921115Z [rank1]:E1204 12:58:38.731000 381007 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9921402Z [rank1]:E1204 12:58:38.731000 381007 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9921562Z 
[rank1]:E1204 12:58:38.731000 381007 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9921869Z [rank1]:E1204 12:58:38.731000 381007 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9921994Z [rank1]:E1204 12:58:38.731000 381007 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9922279Z [rank1]:E1204 12:58:38.731000 381007 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9922444Z [rank1]:E1204 12:58:38.731000 381007 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9922726Z [rank1]:E1204 12:58:38.731000 381007 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9922888Z [rank1]:E1204 12:58:38.731000 381007 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9923167Z [rank1]:E1204 12:58:38.731000 381007 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9923303Z [rank1]:E1204 12:58:38.731000 381007 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9923586Z [rank1]:E1204 12:58:38.731000 381007 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9923740Z [rank1]:E1204 12:58:38.731000 381007 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9924234Z [rank1]:E1204 12:58:38.731000 381007 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3093299200. 
2025-12-04T13:38:31.9924356Z [rank1]:E1204 12:58:38.731000 381007 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9924552Z [rank1]:E1204 12:58:38.731000 381007 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9924931Z [rank1]:E1204 12:58:38.731000 381007 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda 2025-12-04T13:38:31.9925059Z [rank1]:E1204 12:58:38.731000 381007 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9925271Z [rank1]:E1204 12:58:38.731000 381007 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9925443Z [rank1]:E1204 12:58:38.731000 381007 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:31.9925484Z dist init r=1, world=4 2025-12-04T13:38:31.9925831Z [rank0]:[W1204 12:58:38.836695219 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:31.9925875Z FAILED [7.7165s] [ 3%] 2025-12-04T13:38:31.9925877Z 2025-12-04T13:38:31.9925951Z =================================== FAILURES =================================== 2025-12-04T13:38:31.9926070Z _ TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda _ 2025-12-04T13:38:31.9926125Z Traceback (most recent call last): 2025-12-04T13:38:31.9926291Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:31.9926344Z self._join_processes(fn) 2025-12-04T13:38:31.9926520Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:31.9926593Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:31.9926777Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:31.9926825Z raise RuntimeError(error) 2025-12-04T13:38:31.9926912Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:31.9926973Z Traceback (most recent call last): 2025-12-04T13:38:31.9927147Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9927194Z getattr(self, test_name)() 2025-12-04T13:38:31.9927364Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9927404Z fn() 2025-12-04T13:38:31.9927567Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9927614Z method(*args, **kwargs) 2025-12-04T13:38:31.9927778Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9927823Z method(*args, **kwargs) 
2025-12-04T13:38:31.9927986Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9928028Z with policy(): 2025-12-04T13:38:31.9928193Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9928239Z raise RuntimeError(msg) 2025-12-04T13:38:31.9928622Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3229614080. 2025-12-04T13:38:31.9928625Z 2025-12-04T13:38:31.9928706Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9928963Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda 2025-12-04T13:38:31.9928965Z 2025-12-04T13:38:31.9929066Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9929068Z 2025-12-04T13:38:31.9929080Z 2025-12-04T13:38:31.9929160Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:31.9929259Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:31.9929496Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-daad1a9afcbc47ee.xml - 2025-12-04T13:38:31.9929622Z =========================== short test summary info ============================ 2025-12-04T13:38:31.9929889Z FAILED [7.7165s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_True_mixed_precision_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:31.9929945Z Traceback (most recent call last): 2025-12-04T13:38:31.9930134Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9930188Z getattr(self, test_name)() 2025-12-04T13:38:31.9930353Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9930399Z fn() 2025-12-04T13:38:31.9930556Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9930608Z method(*args, **kwargs) 2025-12-04T13:38:31.9930770Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9930828Z method(*args, **kwargs) 2025-12-04T13:38:31.9930990Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9931033Z with policy(): 2025-12-04T13:38:31.9931199Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9931259Z raise RuntimeError(msg) 2025-12-04T13:38:31.9931634Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda! 
Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3229614080. 2025-12-04T13:38:31.9931637Z 2025-12-04T13:38:31.9931715Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9931970Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda 2025-12-04T13:38:31.9931974Z 2025-12-04T13:38:31.9932065Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9932139Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:31.9932206Z ======================= 1 failed, 3 deselected in 7.88s ======================== 2025-12-04T13:38:31.9932254Z Got exit code 1 2025-12-04T13:38:31.9932299Z Retrying single test... 2025-12-04T13:38:31.9932497Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-e061f9bfbb2801c5.xml 2025-12-04T13:38:31.9932566Z ============================= test session starts ============================== 2025-12-04T13:38:31.9932685Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:31.9932738Z cachedir: .pytest_cache 2025-12-04T13:38:31.9932900Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:31.9932956Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:31.9933001Z configfile: pytest.ini 2025-12-04T13:38:31.9933189Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:31.9933269Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:31.9933522Z stepcurrent: skipping 3 already run items. 
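Each failing run above also logs a ProcessGroupNCCL warning that destroy_process_group() was never called before the child process exited, because the leak check raises before any teardown runs. In a standalone script the clean pattern is to tear the group down explicitly. Below is a minimal sketch assuming one process per GPU and that RANK, WORLD_SIZE, MASTER_ADDR and MASTER_PORT are supplied by the launcher (for example torchrun); the training body is a placeholder.

    # Sketch of explicit process-group teardown; env vars assumed set by the launcher.
    import os
    import torch
    import torch.distributed as dist

    def main():
        rank = int(os.environ["RANK"])
        torch.cuda.set_device(rank % torch.cuda.device_count())
        dist.init_process_group(backend="nccl")   # RCCL on ROCm builds
        try:
            pass  # ... test or training body ...
        finally:
            dist.destroy_process_group()          # avoids the shutdown warning seen above

    if __name__ == "__main__":
        main()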
Running only test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_True_mixed_precision_False_cuda 2025-12-04T13:38:31.9933570Z Running 1 items in this shard 2025-12-04T13:38:31.9933572Z 2025-12-04T13:38:31.9933903Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_True_mixed_precision_False_cuda I1204 12:58:42.854000 381323 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 381392 2025-12-04T13:38:31.9934063Z I1204 12:58:42.855000 381323 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 381393 2025-12-04T13:38:31.9934236Z I1204 12:58:42.855000 381323 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 381394 2025-12-04T13:38:31.9934392Z I1204 12:58:42.856000 381323 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 381395 2025-12-04T13:38:31.9934764Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9934823Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9935329Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9935414Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9935771Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9935829Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9936326Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:31.9936390Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9936752Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9936802Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9937163Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9937213Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9937721Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9937792Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9938281Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9938351Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9938497Z [rank0]:E1204 12:58:48.839000 381392 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9938680Z [rank0]:E1204 12:58:48.839000 381392 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9938976Z [rank0]:E1204 12:58:48.839000 381392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9939140Z [rank0]:E1204 12:58:48.839000 381392 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9939436Z [rank0]:E1204 12:58:48.839000 381392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9939614Z [rank0]:E1204 12:58:48.839000 381392 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9939902Z [rank0]:E1204 12:58:48.839000 381392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9940071Z [rank0]:E1204 12:58:48.839000 381392 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9940358Z [rank0]:E1204 12:58:48.839000 381392 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9940509Z [rank0]:E1204 12:58:48.839000 381392 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9940798Z [rank0]:E1204 12:58:48.839000 381392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9940938Z [rank0]:E1204 12:58:48.839000 381392 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9941232Z [rank0]:E1204 12:58:48.839000 381392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9941392Z [rank0]:E1204 12:58:48.839000 381392 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9941885Z [rank0]:E1204 12:58:48.839000 381392 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3229614080. 2025-12-04T13:38:31.9942013Z [rank0]:E1204 12:58:48.839000 381392 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9942226Z [rank0]:E1204 12:58:48.839000 381392 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9942797Z [rank0]:E1204 12:58:48.839000 381392 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda 2025-12-04T13:38:31.9942921Z [rank0]:E1204 12:58:48.839000 381392 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9943138Z [rank0]:E1204 12:58:48.839000 381392 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9943312Z [rank0]:E1204 12:58:48.839000 381392 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:31.9943372Z dist init r=0, world=4 2025-12-04T13:38:31.9943522Z [rank1]:E1204 12:58:48.892000 381393 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9943685Z [rank1]:E1204 12:58:48.892000 381393 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9943984Z [rank1]:E1204 12:58:48.892000 381393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9944153Z [rank1]:E1204 12:58:48.892000 381393 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9944447Z [rank1]:E1204 12:58:48.892000 381393 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9944593Z [rank1]:E1204 12:58:48.892000 381393 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9944874Z [rank1]:E1204 12:58:48.892000 381393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9945032Z [rank1]:E1204 12:58:48.892000 381393 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9945310Z [rank1]:E1204 12:58:48.892000 381393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9945467Z [rank1]:E1204 12:58:48.892000 381393 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9945748Z [rank1]:E1204 12:58:48.892000 381393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9945897Z [rank1]:E1204 12:58:48.892000 381393 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9946191Z [rank1]:E1204 12:58:48.892000 381393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9946343Z [rank1]:E1204 12:58:48.892000 381393 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9946866Z [rank1]:E1204 12:58:48.892000 381393 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3093299200. 
2025-12-04T13:38:31.9946985Z [rank1]:E1204 12:58:48.892000 381393 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9947191Z [rank1]:E1204 12:58:48.892000 381393 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9947565Z [rank1]:E1204 12:58:48.892000 381393 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda 2025-12-04T13:38:31.9947689Z [rank1]:E1204 12:58:48.892000 381393 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9947923Z [rank1]:E1204 12:58:48.892000 381393 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9948091Z [rank1]:E1204 12:58:48.892000 381393 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:31.9948144Z dist init r=1, world=4 2025-12-04T13:38:31.9948285Z [rank2]:E1204 12:58:48.914000 381394 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9948455Z [rank2]:E1204 12:58:48.914000 381394 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9948760Z [rank2]:E1204 12:58:48.914000 381394 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9948926Z [rank2]:E1204 12:58:48.914000 381394 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9949226Z [rank2]:E1204 12:58:48.914000 381394 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9949359Z [rank2]:E1204 12:58:48.914000 381394 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9949680Z [rank2]:E1204 12:58:48.914000 381394 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9949831Z [rank2]:E1204 12:58:48.914000 381394 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9950118Z [rank2]:E1204 12:58:48.914000 381394 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9950270Z [rank2]:E1204 12:58:48.914000 381394 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9950557Z [rank2]:E1204 12:58:48.914000 381394 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9950698Z [rank2]:E1204 12:58:48.914000 381394 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:31.9950988Z [rank2]:E1204 12:58:48.914000 381394 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9951149Z [rank2]:E1204 12:58:48.914000 381394 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9951655Z [rank2]:E1204 12:58:48.914000 381394 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3076521984. 2025-12-04T13:38:31.9951780Z [rank2]:E1204 12:58:48.914000 381394 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9951979Z [rank2]:E1204 12:58:48.914000 381394 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9952376Z [rank2]:E1204 12:58:48.914000 381394 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda 2025-12-04T13:38:31.9952494Z [rank2]:E1204 12:58:48.914000 381394 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9952714Z [rank2]:E1204 12:58:48.914000 381394 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9952889Z [rank2]:E1204 12:58:48.914000 381394 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:31.9952949Z dist init r=2, world=4 2025-12-04T13:38:31.9953097Z [rank3]:E1204 12:58:48.925000 381395 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9953261Z [rank3]:E1204 12:58:48.925000 381395 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9953574Z [rank3]:E1204 12:58:48.925000 381395 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9953730Z [rank3]:E1204 12:58:48.925000 381395 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9954024Z [rank3]:E1204 12:58:48.925000 381395 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9954153Z [rank3]:E1204 12:58:48.925000 381395 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9954438Z [rank3]:E1204 12:58:48.925000 381395 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9954595Z [rank3]:E1204 12:58:48.925000 381395 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9954873Z [rank3]:E1204 12:58:48.925000 381395 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9955028Z [rank3]:E1204 12:58:48.925000 381395 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9955306Z [rank3]:E1204 12:58:48.925000 381395 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9955451Z [rank3]:E1204 12:58:48.925000 381395 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9955747Z [rank3]:E1204 12:58:48.925000 381395 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9955904Z [rank3]:E1204 12:58:48.925000 381395 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9956403Z [rank3]:E1204 12:58:48.925000 381395 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3026190336. 2025-12-04T13:38:31.9956531Z [rank3]:E1204 12:58:48.925000 381395 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9956737Z [rank3]:E1204 12:58:48.925000 381395 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9957112Z [rank3]:E1204 12:58:48.925000 381395 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda 2025-12-04T13:38:31.9957232Z [rank3]:E1204 12:58:48.925000 381395 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9957464Z [rank3]:E1204 12:58:48.925000 381395 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9957632Z [rank3]:E1204 12:58:48.925000 381395 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:31.9957693Z dist init r=3, world=4 2025-12-04T13:38:31.9958033Z [rank0]:[W1204 12:58:49.006656144 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:31.9958085Z FAILED [7.6177s] [100%] 2025-12-04T13:38:31.9958088Z 2025-12-04T13:38:31.9958148Z =================================== FAILURES =================================== 2025-12-04T13:38:31.9958272Z _ TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda _ 2025-12-04T13:38:31.9958323Z Traceback (most recent call last): 2025-12-04T13:38:31.9958497Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:31.9958545Z self._join_processes(fn) 2025-12-04T13:38:31.9958730Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:31.9958791Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:31.9958981Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:31.9959028Z raise RuntimeError(error) 2025-12-04T13:38:31.9959119Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:31.9959169Z Traceback (most recent call last): 2025-12-04T13:38:31.9959342Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9959390Z getattr(self, test_name)() 2025-12-04T13:38:31.9959560Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9959650Z fn() 2025-12-04T13:38:31.9959809Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9959880Z method(*args, **kwargs) 2025-12-04T13:38:31.9960035Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9960087Z method(*args, **kwargs) 2025-12-04T13:38:31.9960242Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9960291Z with policy(): 2025-12-04T13:38:31.9960450Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9960505Z raise RuntimeError(msg) 2025-12-04T13:38:31.9960892Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3229614080. 
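The ProcessGroupNCCL warning above reports that destroy_process_group() was never called before exit. Below is a minimal illustrative sketch (not taken from the test source) of the explicit teardown that warning asks for, assuming a torchrun-style launch with the default env:// rendezvous; the function and variable names are placeholders.

    # Illustrative sketch (not from this log): explicit process-group teardown,
    # assuming a torchrun launch with the default env:// rendezvous.
    import torch
    import torch.distributed as dist

    def main() -> None:
        dist.init_process_group(backend="nccl")
        rank = dist.get_rank()
        torch.cuda.set_device(rank % torch.cuda.device_count())
        try:
            pass  # training / test body would run here
        finally:
            # Calling this before exit avoids the "destroy_process_group() was not called"
            # warning above and releases the communicator's resources.
            dist.destroy_process_group()

    if __name__ == "__main__":
        main()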
2025-12-04T13:38:31.9960896Z 2025-12-04T13:38:31.9960982Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9961233Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda 2025-12-04T13:38:31.9961235Z 2025-12-04T13:38:31.9961334Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9961349Z 2025-12-04T13:38:31.9961350Z 2025-12-04T13:38:31.9961438Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:31.9961529Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:31.9961775Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-e061f9bfbb2801c5.xml - 2025-12-04T13:38:31.9961862Z =========================== short test summary info ============================ 2025-12-04T13:38:31.9962133Z FAILED [7.6177s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_True_mixed_precision_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:31.9962183Z Traceback (most recent call last): 2025-12-04T13:38:31.9962357Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9962405Z getattr(self, test_name)() 2025-12-04T13:38:31.9962574Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9962614Z fn() 2025-12-04T13:38:31.9962776Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9962822Z method(*args, **kwargs) 2025-12-04T13:38:31.9962987Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9963032Z method(*args, **kwargs) 2025-12-04T13:38:31.9963194Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9963236Z with policy(): 2025-12-04T13:38:31.9963397Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9963450Z raise RuntimeError(msg) 2025-12-04T13:38:31.9963819Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3229614080. 2025-12-04T13:38:31.9963823Z 2025-12-04T13:38:31.9963912Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9964174Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda 2025-12-04T13:38:31.9964177Z 2025-12-04T13:38:31.9964275Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9964342Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
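The leak report above compares two per-device counters: bytes held by PyTorch's caching allocator and bytes allocated at the driver level. The following is a simplified approximation, using only public torch.cuda APIs, of how such numbers can be sampled; the real check is the one raising from torch/testing/_internal/common_utils.py in the traceback and is more involved.

    # Simplified approximation (not the harness's exact leak check) of the two
    # counters named in the error message above.
    import torch

    def memory_snapshot(device: int) -> tuple[int, int]:
        torch.cuda.synchronize(device)
        caching = torch.cuda.memory_allocated(device)  # bytes held by the caching allocator
        free, total = torch.cuda.mem_get_info(device)  # driver-level free/total bytes
        return caching, total - free                   # (caching allocator, driver allocated)

    before = memory_snapshot(0)
    # ... run the suspected test body here ...
    after = memory_snapshot(0)
    if after[0] > before[0] or after[1] > before[1]:
        print(f"possible leak: caching {before[0]} -> {after[0]}, driver {before[1]} -> {after[1]}")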
2025-12-04T13:38:31.9964415Z ======================= 1 failed, 32 deselected in 7.76s ======================= 2025-12-04T13:38:31.9964460Z Got exit code 1 2025-12-04T13:38:31.9964512Z Retrying single test... 2025-12-04T13:38:31.9964705Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4998051b6cba4043.xml 2025-12-04T13:38:31.9964783Z ============================= test session starts ============================== 2025-12-04T13:38:31.9964902Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:31.9964954Z cachedir: .pytest_cache 2025-12-04T13:38:31.9965116Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:31.9965173Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:31.9965217Z configfile: pytest.ini 2025-12-04T13:38:31.9965392Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:31.9965488Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:31.9965733Z stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_True_mixed_precision_False_cuda 2025-12-04T13:38:31.9965794Z Running 1 items in this shard 2025-12-04T13:38:31.9965808Z 2025-12-04T13:38:31.9966134Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_True_mixed_precision_False_cuda I1204 12:58:53.407000 381709 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 381778 2025-12-04T13:38:31.9966300Z I1204 12:58:53.408000 381709 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 381779 2025-12-04T13:38:31.9966456Z I1204 12:58:53.408000 381709 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 381780 2025-12-04T13:38:31.9966617Z I1204 12:58:53.409000 381709 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 381781 2025-12-04T13:38:31.9966980Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9967039Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9967401Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9967451Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9967962Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:31.9968028Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9968540Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9968610Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9968966Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9969024Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9969523Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:31.9969641Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9969997Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:31.9970069Z self.encoder = TransformerEncoder( 2025-12-04T13:38:31.9970569Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:31.9970646Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:31.9970801Z [rank2]:E1204 12:58:59.462000 381780 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9970968Z [rank2]:E1204 12:58:59.462000 381780 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9971271Z [rank2]:E1204 12:58:59.462000 381780 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9971431Z [rank2]:E1204 12:58:59.462000 381780 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9971730Z [rank2]:E1204 12:58:59.462000 381780 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9971859Z [rank2]:E1204 12:58:59.462000 381780 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9972147Z [rank2]:E1204 12:58:59.462000 381780 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9972307Z [rank2]:E1204 12:58:59.462000 381780 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9972586Z [rank2]:E1204 12:58:59.462000 381780 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9972760Z [rank2]:E1204 12:58:59.462000 381780 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9973041Z [rank2]:E1204 12:58:59.462000 381780 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9973187Z [rank2]:E1204 12:58:59.462000 381780 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9973470Z [rank2]:E1204 12:58:59.462000 381780 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9973629Z [rank2]:E1204 12:58:59.462000 381780 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9974147Z [rank2]:E1204 12:58:59.462000 381780 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3076521984. 
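The FSDP UserWarnings above flag a bare `device_id` of "cuda" with no index, and the warning text itself offers two remedies: call torch.cuda.set_device() before FSDP initialization, or pass an indexed device. A short illustrative sketch of both options; `model` and `rank` are placeholders, not names from the test.

    # Illustrative sketch of the two fixes suggested by the FSDP `device_id` warnings above;
    # `model` and `rank` are placeholders.
    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_with_fsdp(model: torch.nn.Module, rank: int) -> FSDP:
        torch.cuda.set_device(rank)  # option 1: make the current device explicit first
        # option 2: pass an indexed device instead of the bare "cuda" string
        return FSDP(model, device_id=torch.device("cuda", rank))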
2025-12-04T13:38:31.9974267Z [rank2]:E1204 12:58:59.462000 381780 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9974472Z [rank2]:E1204 12:58:59.462000 381780 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9974859Z [rank2]:E1204 12:58:59.462000 381780 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda 2025-12-04T13:38:31.9974997Z [rank2]:E1204 12:58:59.462000 381780 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9975219Z [rank2]:E1204 12:58:59.462000 381780 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9975387Z [rank2]:E1204 12:58:59.462000 381780 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:31.9975439Z dist init r=2, world=4 2025-12-04T13:38:31.9975580Z [rank0]:E1204 12:58:59.505000 381778 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9975751Z [rank0]:E1204 12:58:59.505000 381778 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9976042Z [rank0]:E1204 12:58:59.505000 381778 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9976207Z [rank0]:E1204 12:58:59.505000 381778 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9976495Z [rank0]:E1204 12:58:59.505000 381778 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9976628Z [rank0]:E1204 12:58:59.505000 381778 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9976909Z [rank0]:E1204 12:58:59.505000 381778 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9977068Z [rank0]:E1204 12:58:59.505000 381778 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9977368Z [rank0]:E1204 12:58:59.505000 381778 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9977518Z [rank0]:E1204 12:58:59.505000 381778 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9977806Z [rank0]:E1204 12:58:59.505000 381778 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9977946Z [rank0]:E1204 12:58:59.505000 381778 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:31.9978243Z [rank0]:E1204 12:58:59.505000 381778 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9978395Z [rank0]:E1204 12:58:59.505000 381778 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9978893Z [rank0]:E1204 12:58:59.505000 381778 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3229614080. 2025-12-04T13:38:31.9979038Z [rank0]:E1204 12:58:59.505000 381778 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9979239Z [rank0]:E1204 12:58:59.505000 381778 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9979697Z [rank0]:E1204 12:58:59.505000 381778 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda 2025-12-04T13:38:31.9979814Z [rank0]:E1204 12:58:59.505000 381778 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9980034Z [rank0]:E1204 12:58:59.505000 381778 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9980209Z [rank0]:E1204 12:58:59.505000 381778 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:31.9980253Z dist init r=0, world=4 2025-12-04T13:38:31.9980400Z [rank1]:E1204 12:58:59.507000 381779 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9980565Z [rank1]:E1204 12:58:59.507000 381779 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9980859Z [rank1]:E1204 12:58:59.507000 381779 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9981011Z [rank1]:E1204 12:58:59.507000 381779 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9981299Z [rank1]:E1204 12:58:59.507000 381779 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9981423Z [rank1]:E1204 12:58:59.507000 381779 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9981716Z [rank1]:E1204 12:58:59.507000 381779 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9981864Z [rank1]:E1204 12:58:59.507000 381779 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9982137Z [rank1]:E1204 12:58:59.507000 381779 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9982286Z [rank1]:E1204 12:58:59.507000 381779 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9982573Z [rank1]:E1204 12:58:59.507000 381779 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9982714Z [rank1]:E1204 12:58:59.507000 381779 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9982991Z [rank1]:E1204 12:58:59.507000 381779 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9983141Z [rank1]:E1204 12:58:59.507000 381779 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9983648Z [rank1]:E1204 12:58:59.507000 381779 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3093299200. 2025-12-04T13:38:31.9983776Z [rank1]:E1204 12:58:59.507000 381779 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9983973Z [rank1]:E1204 12:58:59.507000 381779 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9984342Z [rank1]:E1204 12:58:59.507000 381779 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda 2025-12-04T13:38:31.9984459Z [rank1]:E1204 12:58:59.507000 381779 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9984670Z [rank1]:E1204 12:58:59.507000 381779 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9984839Z [rank1]:E1204 12:58:59.507000 381779 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:31.9984880Z dist init r=1, world=4 2025-12-04T13:38:31.9985015Z [rank3]:E1204 12:58:59.518000 381781 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:31.9985178Z [rank3]:E1204 12:58:59.518000 381781 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:31.9985463Z [rank3]:E1204 12:58:59.518000 381781 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9985619Z 
[rank3]:E1204 12:58:59.518000 381781 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:31.9985913Z [rank3]:E1204 12:58:59.518000 381781 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9986039Z [rank3]:E1204 12:58:59.518000 381781 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:31.9986313Z [rank3]:E1204 12:58:59.518000 381781 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9986462Z [rank3]:E1204 12:58:59.518000 381781 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9986747Z [rank3]:E1204 12:58:59.518000 381781 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9986894Z [rank3]:E1204 12:58:59.518000 381781 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:31.9987171Z [rank3]:E1204 12:58:59.518000 381781 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9987306Z [rank3]:E1204 12:58:59.518000 381781 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:31.9987586Z [rank3]:E1204 12:58:59.518000 381781 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9987746Z [rank3]:E1204 12:58:59.518000 381781 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:31.9988239Z [rank3]:E1204 12:58:59.518000 381781 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3026190336. 
2025-12-04T13:38:31.9988364Z [rank3]:E1204 12:58:59.518000 381781 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9988558Z [rank3]:E1204 12:58:59.518000 381781 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9988930Z [rank3]:E1204 12:58:59.518000 381781 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda 2025-12-04T13:38:31.9989042Z [rank3]:E1204 12:58:59.518000 381781 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:31.9989253Z [rank3]:E1204 12:58:59.518000 381781 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9989416Z [rank3]:E1204 12:58:59.518000 381781 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:31.9989456Z dist init r=3, world=4 2025-12-04T13:38:31.9989828Z [rank0]:[W1204 12:58:59.754491296 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:31.9989868Z FAILED [7.7187s] [100%] 2025-12-04T13:38:31.9989870Z 2025-12-04T13:38:31.9989930Z =================================== FAILURES =================================== 2025-12-04T13:38:31.9990058Z _ TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda _ 2025-12-04T13:38:31.9990108Z Traceback (most recent call last): 2025-12-04T13:38:31.9990270Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:31.9990316Z self._join_processes(fn) 2025-12-04T13:38:31.9990488Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:31.9990545Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:31.9990723Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:31.9990767Z raise RuntimeError(error) 2025-12-04T13:38:31.9990859Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:38:31.9990909Z Traceback (most recent call last): 2025-12-04T13:38:31.9991070Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9991115Z getattr(self, test_name)() 2025-12-04T13:38:31.9991272Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9991311Z fn() 2025-12-04T13:38:31.9991464Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9991519Z method(*args, **kwargs) 2025-12-04T13:38:31.9991669Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9991712Z method(*args, **kwargs) 
2025-12-04T13:38:31.9991866Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9991917Z with policy(): 2025-12-04T13:38:31.9992071Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9992112Z raise RuntimeError(msg) 2025-12-04T13:38:31.9992478Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3076521984. 2025-12-04T13:38:31.9992482Z 2025-12-04T13:38:31.9992557Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9992806Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda 2025-12-04T13:38:31.9992808Z 2025-12-04T13:38:31.9992898Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9992900Z 2025-12-04T13:38:31.9992903Z 2025-12-04T13:38:31.9992980Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:31.9993066Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:31.9993301Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4998051b6cba4043.xml - 2025-12-04T13:38:31.9993363Z =========================== short test summary info ============================ 2025-12-04T13:38:31.9993622Z FAILED [7.7187s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_True_mixed_precision_False_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:38:31.9993670Z Traceback (most recent call last): 2025-12-04T13:38:31.9993834Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:31.9993891Z getattr(self, test_name)() 2025-12-04T13:38:31.9994052Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:31.9994088Z fn() 2025-12-04T13:38:31.9994240Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9994283Z method(*args, **kwargs) 2025-12-04T13:38:31.9994435Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:31.9994479Z method(*args, **kwargs) 2025-12-04T13:38:31.9994631Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:31.9994670Z with policy(): 2025-12-04T13:38:31.9994839Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:31.9994884Z raise RuntimeError(msg) 2025-12-04T13:38:31.9995246Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda! 
Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3076521984. 2025-12-04T13:38:31.9995251Z 2025-12-04T13:38:31.9995324Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:31.9995583Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_register_functions_called_cuda_first_True_mixed_precision_False_cuda 2025-12-04T13:38:31.9995586Z 2025-12-04T13:38:31.9995672Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:31.9995749Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:31.9995811Z ======================= 1 failed, 32 deselected in 7.86s ======================= 2025-12-04T13:38:31.9995850Z Got exit code 1 2025-12-04T13:38:31.9996045Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_True_mixed_precision_False_cuda 2025-12-04T13:38:31.9996175Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:31.9996362Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d31fb4781e379879.xml 2025-12-04T13:38:31.9996422Z ============================= test session starts ============================== 2025-12-04T13:38:31.9996533Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:31.9996577Z cachedir: .pytest_cache 2025-12-04T13:38:31.9996737Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:31.9996788Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:31.9996828Z configfile: pytest.ini 2025-12-04T13:38:31.9996994Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:31.9997068Z collecting ... collected 60 items / 4 deselected / 56 selected 2025-12-04T13:38:31.9997120Z stepcurrent: skipping 4 already run items. 2025-12-04T13:38:31.9997166Z Running 29 items in this shard 2025-12-04T13:38:31.9997168Z 2025-12-04T13:38:31.9997477Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_no_shard_cuda I1204 12:59:03.885000 382095 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 382164 2025-12-04T13:38:31.9997636Z I1204 12:59:03.886000 382095 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 382165 2025-12-04T13:38:31.9997799Z I1204 12:59:03.886000 382095 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 382166 2025-12-04T13:38:31.9997951Z I1204 12:59:03.887000 382095 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 382167 2025-12-04T13:38:31.9998245Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:38:31.9998298Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:31.9998900Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:31.9998940Z _warn_cpu_init() 2025-12-04T13:38:31.9999230Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:31.9999279Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:31.9999887Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:31.9999952Z _warn_cpu_init() 2025-12-04T13:38:32.0000240Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0000289Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0000573Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0000623Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0001201Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0001243Z _warn_cpu_init() 2025-12-04T13:38:32.0001807Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0001846Z _warn_cpu_init() 2025-12-04T13:38:32.0002138Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:38:32.0002230Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0002517Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0002593Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0002879Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0002952Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0003250Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0003323Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0004593Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.0004743Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.0004973Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0005017Z return func(*args, **kwargs) 2025-12-04T13:38:32.0006285Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. 
(Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.0006410Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.0006636Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0006680Z return func(*args, **kwargs) 2025-12-04T13:38:32.0007951Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.0008084Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.0008313Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0008354Z return func(*args, **kwargs) 2025-12-04T13:38:32.0009647Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.0009794Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.0010017Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T13:38:32.0010061Z return func(*args, **kwargs) 2025-12-04T13:38:32.0010282Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0010326Z return func(*args, **kwargs) 2025-12-04T13:38:32.0010549Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0010590Z return func(*args, **kwargs) 2025-12-04T13:38:32.0010810Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0010852Z return func(*args, **kwargs) 2025-12-04T13:38:32.0011070Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0011114Z return func(*args, **kwargs) 2025-12-04T13:38:32.0011407Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.0011464Z return func(*args, **kwargs) 2025-12-04T13:38:32.0011611Z [rank0]:E1204 12:59:34.948000 382164 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0011774Z [rank0]:E1204 12:59:34.948000 382164 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0012067Z [rank0]:E1204 12:59:34.948000 382164 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0012224Z [rank0]:E1204 12:59:34.948000 382164 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0012526Z [rank0]:E1204 12:59:34.948000 382164 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0012652Z [rank0]:E1204 12:59:34.948000 382164 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0012936Z [rank0]:E1204 12:59:34.948000 382164 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0013087Z [rank0]:E1204 12:59:34.948000 382164 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0013378Z [rank0]:E1204 12:59:34.948000 382164 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0013528Z [rank0]:E1204 12:59:34.948000 382164 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0013815Z [rank0]:E1204 12:59:34.948000 382164 site-packages/torch/testing/_internal/common_distributed.py:935] File
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0013954Z [rank0]:E1204 12:59:34.948000 382164 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0014230Z [rank0]:E1204 12:59:34.948000 382164 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0014380Z [rank0]:E1204 12:59:34.948000 382164 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0014864Z [rank0]:E1204 12:59:34.948000 382164 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 0. CUDA driver allocated memory was 2453667840 and is now 4013948928. 2025-12-04T13:38:32.0014980Z [rank0]:E1204 12:59:34.948000 382164 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0015179Z [rank0]:E1204 12:59:34.948000 382164 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0015540Z [rank0]:E1204 12:59:34.948000 382164 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda 2025-12-04T13:38:32.0015657Z [rank0]:E1204 12:59:34.948000 382164 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0015885Z [rank0]:E1204 12:59:34.948000 382164 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0016051Z [rank0]:E1204 12:59:34.948000 382164 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.0016092Z dist init r=0, world=4 2025-12-04T13:38:32.0016228Z [rank3]:E1204 12:59:34.953000 382167 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0016389Z [rank3]:E1204 12:59:34.953000 382167 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0016684Z [rank3]:E1204 12:59:34.953000 382167 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0016841Z [rank3]:E1204 12:59:34.953000 382167 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0017125Z [rank3]:E1204 12:59:34.953000 382167 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0017250Z [rank3]:E1204 12:59:34.953000 382167 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0017526Z [rank3]:E1204 12:59:34.953000 382167 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0017686Z [rank3]:E1204 12:59:34.953000 382167 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0017974Z [rank3]:E1204 12:59:34.953000 382167 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0018119Z [rank3]:E1204 12:59:34.953000 382167 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0018397Z [rank3]:E1204 12:59:34.953000 382167 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0018533Z [rank3]:E1204 12:59:34.953000 382167 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0018815Z [rank3]:E1204 12:59:34.953000 382167 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0018962Z [rank3]:E1204 12:59:34.953000 382167 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0019443Z [rank3]:E1204 12:59:34.953000 382167 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 3. CUDA driver allocated memory was 2250244096 and is now 3810525184. 
2025-12-04T13:38:32.0019558Z [rank3]:E1204 12:59:34.953000 382167 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0019787Z [rank3]:E1204 12:59:34.953000 382167 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0020161Z [rank3]:E1204 12:59:34.953000 382167 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda 2025-12-04T13:38:32.0020274Z [rank3]:E1204 12:59:34.953000 382167 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0020485Z [rank3]:E1204 12:59:34.953000 382167 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0020648Z [rank3]:E1204 12:59:34.953000 382167 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.0020689Z dist init r=3, world=4 2025-12-04T13:38:32.0020827Z [rank2]:E1204 12:59:34.991000 382166 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0021000Z [rank2]:E1204 12:59:34.991000 382166 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0021290Z [rank2]:E1204 12:59:34.991000 382166 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0021443Z [rank2]:E1204 12:59:34.991000 382166 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0021730Z [rank2]:E1204 12:59:34.991000 382166 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0021864Z [rank2]:E1204 12:59:34.991000 382166 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0022142Z [rank2]:E1204 12:59:34.991000 382166 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0022300Z [rank2]:E1204 12:59:34.991000 382166 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0022577Z [rank2]:E1204 12:59:34.991000 382166 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0022726Z [rank2]:E1204 12:59:34.991000 382166 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0023001Z [rank2]:E1204 12:59:34.991000 382166 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0023138Z [rank2]:E1204 12:59:34.991000 382166 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.0023415Z [rank2]:E1204 12:59:34.991000 382166 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0023565Z [rank2]:E1204 12:59:34.991000 382166 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0024044Z [rank2]:E1204 12:59:34.991000 382166 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 2. CUDA driver allocated memory was 2300575744 and is now 3860856832. 2025-12-04T13:38:32.0024160Z [rank2]:E1204 12:59:34.991000 382166 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0024463Z [rank2]:E1204 12:59:34.991000 382166 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0024821Z [rank2]:E1204 12:59:34.991000 382166 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda 2025-12-04T13:38:32.0024935Z [rank2]:E1204 12:59:34.991000 382166 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0025146Z [rank2]:E1204 12:59:34.991000 382166 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0025323Z [rank2]:E1204 12:59:34.991000 382166 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.0025362Z dist init r=2, world=4 2025-12-04T13:38:32.0025500Z [rank1]:E1204 12:59:35.000000 382165 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0025660Z [rank1]:E1204 12:59:35.000000 382165 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0025946Z [rank1]:E1204 12:59:35.000000 382165 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0026110Z [rank1]:E1204 12:59:35.000000 382165 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0026393Z [rank1]:E1204 12:59:35.000000 382165 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0026539Z [rank1]:E1204 12:59:35.000000 382165 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0026815Z [rank1]:E1204 12:59:35.000000 382165 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0026962Z [rank1]:E1204 12:59:35.000000 382165 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:38:32.0027238Z [rank1]:E1204 12:59:35.000000 382165 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0027386Z [rank1]:E1204 12:59:35.000000 382165 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0027662Z [rank1]:E1204 12:59:35.000000 382165 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0027797Z [rank1]:E1204 12:59:35.000000 382165 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0028074Z [rank1]:E1204 12:59:35.000000 382165 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0028221Z [rank1]:E1204 12:59:35.000000 382165 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0028707Z [rank1]:E1204 12:59:35.000000 382165 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 1. CUDA driver allocated memory was 2317352960 and is now 3877634048. 2025-12-04T13:38:32.0028823Z [rank1]:E1204 12:59:35.000000 382165 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0029018Z [rank1]:E1204 12:59:35.000000 382165 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0029377Z [rank1]:E1204 12:59:35.000000 382165 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda 2025-12-04T13:38:32.0029488Z [rank1]:E1204 12:59:35.000000 382165 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0029767Z [rank1]:E1204 12:59:35.000000 382165 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0029929Z [rank1]:E1204 12:59:35.000000 382165 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.0029970Z dist init r=1, world=4 2025-12-04T13:38:32.0030302Z [rank0]:[W1204 12:59:35.133500092 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.0030358Z FAILED [32.9528s] [ 3%] 2025-12-04T13:38:32.0030360Z 2025-12-04T13:38:32.0030419Z =================================== FAILURES =================================== 2025-12-04T13:38:32.0030518Z __ TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda ___ 2025-12-04T13:38:32.0030582Z Traceback (most recent call last): 2025-12-04T13:38:32.0030747Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.0030793Z self._join_processes(fn) 2025-12-04T13:38:32.0030964Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.0031019Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.0031199Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.0031247Z raise RuntimeError(error) 2025-12-04T13:38:32.0031326Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.0031373Z Traceback (most recent call last): 2025-12-04T13:38:32.0031534Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0031578Z getattr(self, test_name)() 2025-12-04T13:38:32.0031736Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0031772Z fn() 2025-12-04T13:38:32.0031923Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0031968Z method(*args, **kwargs) 2025-12-04T13:38:32.0032117Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0032161Z method(*args, **kwargs) 2025-12-04T13:38:32.0032311Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0032351Z with policy(): 2025-12-04T13:38:32.0032503Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0032546Z raise RuntimeError(msg) 2025-12-04T13:38:32.0032915Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 0. CUDA driver allocated memory was 2453667840 and is now 4013948928. 
2025-12-04T13:38:32.0032918Z 2025-12-04T13:38:32.0032994Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0033227Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda 2025-12-04T13:38:32.0033230Z 2025-12-04T13:38:32.0033317Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0033320Z 2025-12-04T13:38:32.0033321Z 2025-12-04T13:38:32.0033407Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.0033498Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.0033734Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d31fb4781e379879.xml - 2025-12-04T13:38:32.0033794Z =========================== short test summary info ============================ 2025-12-04T13:38:32.0034045Z FAILED [32.9528s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.0034102Z Traceback (most recent call last): 2025-12-04T13:38:32.0034267Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0034310Z getattr(self, test_name)() 2025-12-04T13:38:32.0034470Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0034518Z fn() 2025-12-04T13:38:32.0034669Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0034711Z method(*args, **kwargs) 2025-12-04T13:38:32.0034862Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0034903Z method(*args, **kwargs) 2025-12-04T13:38:32.0035054Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0035093Z with policy(): 2025-12-04T13:38:32.0035244Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0035287Z raise RuntimeError(msg) 2025-12-04T13:38:32.0035640Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 0. CUDA driver allocated memory was 2453667840 and is now 4013948928. 2025-12-04T13:38:32.0035644Z 2025-12-04T13:38:32.0035720Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0035953Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda 2025-12-04T13:38:32.0035956Z 2025-12-04T13:38:32.0036041Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0036105Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
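Editor's note on the failure above: the leak checker enabled by PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 compares the caching-allocator bytes and the driver-allocated bytes on each device before and after the test body; in this run every rank grew from 512 B to 147,968 B in the allocator and by roughly 1.5 GB at the driver level. Below is a minimal sketch of that kind of before/after comparison for local debugging only; it is an approximation, not the actual test-harness implementation, and run_suspect_test() is a hypothetical placeholder standing in for the repro command printed above.

    import torch

    def snapshot(device: int):
        # Driver-level view (free, total) plus caching-allocator view.
        free, total = torch.cuda.mem_get_info(device)
        return torch.cuda.memory_allocated(device), total - free

    def check_for_leak(device: int = 0):
        torch.cuda.empty_cache()
        alloc_before, driver_before = snapshot(device)
        run_suspect_test()  # hypothetical helper, e.g. the failing FSDP test body
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_after, driver_after = snapshot(device)
        if alloc_after > alloc_before or driver_after > driver_before:
            raise RuntimeError(
                f"possible leak: allocator {alloc_before} -> {alloc_after}, "
                f"driver-level {driver_before} -> {driver_after}"
            )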
2025-12-04T13:38:32.0036166Z ======================= 1 failed, 4 deselected in 33.11s ======================= 2025-12-04T13:38:32.0036204Z Got exit code 1 2025-12-04T13:38:32.0036246Z Retrying single test... 2025-12-04T13:38:32.0036445Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-10252b0d5be41435.xml 2025-12-04T13:38:32.0036503Z ============================= test session starts ============================== 2025-12-04T13:38:32.0036617Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.0036658Z cachedir: .pytest_cache 2025-12-04T13:38:32.0036817Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.0036863Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.0036905Z configfile: pytest.ini 2025-12-04T13:38:32.0037068Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.0037143Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.0037375Z stepcurrent: skipping 4 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_no_shard_cuda 2025-12-04T13:38:32.0037420Z Running 1 items in this shard 2025-12-04T13:38:32.0037422Z 2025-12-04T13:38:32.0037729Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_no_shard_cuda I1204 12:59:39.501000 382497 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 382566 2025-12-04T13:38:32.0037885Z I1204 12:59:39.501000 382497 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 382567 2025-12-04T13:38:32.0038049Z I1204 12:59:39.502000 382497 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 382568 2025-12-04T13:38:32.0038198Z I1204 12:59:39.502000 382497 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 382569 2025-12-04T13:38:32.0038503Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0038554Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0038840Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0038888Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0039473Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
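Editor's note on the _warn_cpu_init() UserWarning repeated above: FSDP is handed a CPU-resident module, so sharding initialization runs on CPU and sync_module_states=True cannot use GPU communication. A hedged sketch of the fix the warning itself recommends, passing device_id at wrap time, follows; the module construction and rank handling are illustrative placeholders, not code from this test.

    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_with_device_id(module: torch.nn.Module) -> FSDP:
        # Pin the wrap to this rank's GPU so sharding init happens on-device.
        device = torch.device("cuda", dist.get_rank() % torch.cuda.device_count())
        return FSDP(
            module,                    # may still be on CPU at this point
            device_id=device,          # moves the module to GPU for init
            sync_module_states=True,   # needs GPU communication, hence device_id
        )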
2025-12-04T13:38:32.0039512Z _warn_cpu_init() 2025-12-04T13:38:32.0039814Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0039866Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0040436Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0040478Z _warn_cpu_init() 2025-12-04T13:38:32.0041057Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0041097Z _warn_cpu_init() 2025-12-04T13:38:32.0041392Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0041439Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0042028Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0042066Z _warn_cpu_init() 2025-12-04T13:38:32.0042353Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0042450Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0042737Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0042827Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0043114Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:38:32.0043187Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0043470Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0043544Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0044819Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.0044945Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.0045187Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0045231Z return func(*args, **kwargs) 2025-12-04T13:38:32.0046493Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.0046618Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.0047857Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. 
This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.0047995Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.0048222Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0048265Z return func(*args, **kwargs) 2025-12-04T13:38:32.0048489Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0048533Z return func(*args, **kwargs) 2025-12-04T13:38:32.0049856Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.0049993Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.0050221Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0050262Z return func(*args, **kwargs) 2025-12-04T13:38:32.0050488Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0050529Z return func(*args, **kwargs) 2025-12-04T13:38:32.0050752Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0050791Z return func(*args, **kwargs) 2025-12-04T13:38:32.0051024Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned.
2025-12-04T13:38:32.0051067Z return func(*args, **kwargs) 2025-12-04T13:38:32.0051286Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0051325Z return func(*args, **kwargs) 2025-12-04T13:38:32.0051619Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.0051672Z return func(*args, **kwargs) 2025-12-04T13:38:32.0051820Z [rank3]:E1204 13:00:10.576000 382569 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0051985Z [rank3]:E1204 13:00:10.576000 382569 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0052290Z [rank3]:E1204 13:00:10.576000 382569 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0052447Z [rank3]:E1204 13:00:10.576000 382569 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0052732Z [rank3]:E1204 13:00:10.576000 382569 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0052860Z [rank3]:E1204 13:00:10.576000 382569 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0053138Z [rank3]:E1204 13:00:10.576000 382569 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0053288Z [rank3]:E1204 13:00:10.576000 382569 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0053563Z [rank3]:E1204 13:00:10.576000 382569 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0053712Z [rank3]:E1204 13:00:10.576000 382569 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0053992Z [rank3]:E1204 13:00:10.576000 382569 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0054129Z [rank3]:E1204 13:00:10.576000 382569 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0054419Z [rank3]:E1204 13:00:10.576000 382569 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0054568Z [rank3]:E1204 13:00:10.576000 382569 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0055053Z [rank3]:E1204 13:00:10.576000 382569 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver
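Editor's note: two of the warnings above carry actionable guidance. The FutureWarning says the NO_SHARD sharding strategy is deprecated in favor of DistributedDataParallel, and the AccumulateGrad message notes that an intentional stream mismatch can be silenced with torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False). A minimal DDP-based sketch of the suggested replacement follows; the model and rank handling are illustrative placeholders, not code from this test.

    import torch
    from torch.nn.parallel import DistributedDataParallel as DDP

    def wrap_instead_of_no_shard(model: torch.nn.Module, rank: int) -> DDP:
        # DDP replicates (rather than shards) parameters, which is what
        # ShardingStrategy.NO_SHARD effectively did.
        model = model.to(torch.device("cuda", rank))
        return DDP(model, device_ids=[rank])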
API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 3. CUDA driver allocated memory was 2250244096 and is now 3810525184. 2025-12-04T13:38:32.0055184Z [rank3]:E1204 13:00:10.576000 382569 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0055382Z [rank3]:E1204 13:00:10.576000 382569 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0055743Z [rank3]:E1204 13:00:10.576000 382569 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda 2025-12-04T13:38:32.0055856Z [rank3]:E1204 13:00:10.576000 382569 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0056081Z [rank3]:E1204 13:00:10.576000 382569 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0056247Z [rank3]:E1204 13:00:10.576000 382569 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.0056301Z dist init r=3, world=4 2025-12-04T13:38:32.0056439Z [rank2]:E1204 13:00:10.618000 382568 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0056602Z [rank2]:E1204 13:00:10.618000 382568 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0056892Z [rank2]:E1204 13:00:10.618000 382568 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0057045Z [rank2]:E1204 13:00:10.618000 382568 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0057335Z [rank2]:E1204 13:00:10.618000 382568 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0057460Z [rank2]:E1204 13:00:10.618000 382568 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0057741Z [rank2]:E1204 13:00:10.618000 382568 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0057888Z [rank2]:E1204 13:00:10.618000 382568 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0058169Z [rank2]:E1204 13:00:10.618000 382568 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0058316Z [rank2]:E1204 13:00:10.618000 382568 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0058605Z [rank2]:E1204 13:00:10.618000 382568 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0058743Z [rank2]:E1204 13:00:10.618000 382568 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0059018Z [rank2]:E1204 13:00:10.618000 382568 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0059169Z [rank2]:E1204 13:00:10.618000 382568 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0059709Z [rank2]:E1204 13:00:10.618000 382568 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 2. CUDA driver allocated memory was 2300575744 and is now 3860856832. 2025-12-04T13:38:32.0059826Z [rank2]:E1204 13:00:10.618000 382568 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0060022Z [rank2]:E1204 13:00:10.618000 382568 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0060377Z [rank2]:E1204 13:00:10.618000 382568 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda 2025-12-04T13:38:32.0060503Z [rank2]:E1204 13:00:10.618000 382568 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0060728Z [rank2]:E1204 13:00:10.618000 382568 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0060895Z [rank2]:E1204 13:00:10.618000 382568 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.0060934Z dist init r=2, world=4 2025-12-04T13:38:32.0061072Z [rank1]:E1204 13:00:10.626000 382567 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0061232Z [rank1]:E1204 13:00:10.626000 382567 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0061522Z [rank1]:E1204 13:00:10.626000 382567 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0061677Z [rank1]:E1204 13:00:10.626000 382567 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0061966Z [rank1]:E1204 13:00:10.626000 382567 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0062091Z [rank1]:E1204 13:00:10.626000 382567 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0062367Z [rank1]:E1204 13:00:10.626000 382567 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0062517Z [rank1]:E1204 13:00:10.626000 382567 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0062806Z [rank1]:E1204 13:00:10.626000 382567 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0062957Z [rank1]:E1204 13:00:10.626000 382567 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0063232Z [rank1]:E1204 13:00:10.626000 382567 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0063366Z [rank1]:E1204 13:00:10.626000 382567 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0063645Z [rank1]:E1204 13:00:10.626000 382567 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0063803Z [rank1]:E1204 13:00:10.626000 382567 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0064284Z [rank1]:E1204 13:00:10.626000 382567 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 1. CUDA driver allocated memory was 2317352960 and is now 3877634048. 
2025-12-04T13:38:32.0064398Z [rank1]:E1204 13:00:10.626000 382567 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0064602Z [rank1]:E1204 13:00:10.626000 382567 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0064961Z [rank1]:E1204 13:00:10.626000 382567 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda 2025-12-04T13:38:32.0065082Z [rank1]:E1204 13:00:10.626000 382567 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0065294Z [rank1]:E1204 13:00:10.626000 382567 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0065457Z [rank1]:E1204 13:00:10.626000 382567 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.0065499Z dist init r=1, world=4 2025-12-04T13:38:32.0065636Z [rank0]:E1204 13:00:10.628000 382566 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0065798Z [rank0]:E1204 13:00:10.628000 382566 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0066084Z [rank0]:E1204 13:00:10.628000 382566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0066239Z [rank0]:E1204 13:00:10.628000 382566 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0066527Z [rank0]:E1204 13:00:10.628000 382566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0066650Z [rank0]:E1204 13:00:10.628000 382566 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0066928Z [rank0]:E1204 13:00:10.628000 382566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0067085Z [rank0]:E1204 13:00:10.628000 382566 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0067361Z [rank0]:E1204 13:00:10.628000 382566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0067506Z [rank0]:E1204 13:00:10.628000 382566 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0067783Z [rank0]:E1204 13:00:10.628000 382566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0067932Z [rank0]:E1204 13:00:10.628000 382566 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.0068211Z [rank0]:E1204 13:00:10.628000 382566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0068359Z [rank0]:E1204 13:00:10.628000 382566 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0068834Z [rank0]:E1204 13:00:10.628000 382566 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 0. CUDA driver allocated memory was 2453667840 and is now 4013948928. 2025-12-04T13:38:32.0068960Z [rank0]:E1204 13:00:10.628000 382566 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0069165Z [rank0]:E1204 13:00:10.628000 382566 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0069522Z [rank0]:E1204 13:00:10.628000 382566 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda 2025-12-04T13:38:32.0069663Z [rank0]:E1204 13:00:10.628000 382566 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0069874Z [rank0]:E1204 13:00:10.628000 382566 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0070038Z [rank0]:E1204 13:00:10.628000 382566 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.0070077Z dist init r=0, world=4 2025-12-04T13:38:32.0070415Z [rank0]:[W1204 13:00:10.889231270 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.0070456Z FAILED [32.9531s] [100%] 2025-12-04T13:38:32.0070458Z 2025-12-04T13:38:32.0070517Z =================================== FAILURES =================================== 2025-12-04T13:38:32.0070617Z __ TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda ___ 2025-12-04T13:38:32.0070666Z Traceback (most recent call last): 2025-12-04T13:38:32.0070830Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.0070876Z self._join_processes(fn) 2025-12-04T13:38:32.0071050Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.0071105Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.0071296Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.0071340Z raise RuntimeError(error) 2025-12-04T13:38:32.0071422Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.0071467Z Traceback (most recent call last): 2025-12-04T13:38:32.0071629Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0071672Z getattr(self, test_name)() 2025-12-04T13:38:32.0071832Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0071867Z fn() 2025-12-04T13:38:32.0072032Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0072075Z method(*args, **kwargs) 2025-12-04T13:38:32.0072229Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0072270Z method(*args, **kwargs) 2025-12-04T13:38:32.0072423Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0072459Z with policy(): 2025-12-04T13:38:32.0072614Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0072667Z raise RuntimeError(msg) 2025-12-04T13:38:32.0073023Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 3. CUDA driver allocated memory was 2250244096 and is now 3810525184. 
2025-12-04T13:38:32.0073037Z 2025-12-04T13:38:32.0073116Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0073348Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda 2025-12-04T13:38:32.0073351Z 2025-12-04T13:38:32.0073440Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0073442Z 2025-12-04T13:38:32.0073443Z 2025-12-04T13:38:32.0073518Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.0073610Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.0073842Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-10252b0d5be41435.xml - 2025-12-04T13:38:32.0073907Z =========================== short test summary info ============================ 2025-12-04T13:38:32.0074156Z FAILED [32.9531s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_no_shard_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.0074203Z Traceback (most recent call last): 2025-12-04T13:38:32.0074366Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0074410Z getattr(self, test_name)() 2025-12-04T13:38:32.0074569Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0074606Z fn() 2025-12-04T13:38:32.0074761Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0074801Z method(*args, **kwargs) 2025-12-04T13:38:32.0074956Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0075014Z method(*args, **kwargs) 2025-12-04T13:38:32.0075167Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0075203Z with policy(): 2025-12-04T13:38:32.0075356Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0075398Z raise RuntimeError(msg) 2025-12-04T13:38:32.0075752Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 3. CUDA driver allocated memory was 2250244096 and is now 3810525184. 2025-12-04T13:38:32.0075755Z 2025-12-04T13:38:32.0075828Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0076073Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda 2025-12-04T13:38:32.0076075Z 2025-12-04T13:38:32.0076160Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0076225Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
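Editor's note: besides the leak itself, the run prints two process-group hygiene warnings: barrier() guessing the device (the message suggests passing device_id to init_process_group) and destroy_process_group() never being called before exit. A hedged sketch of the lifecycle both messages point at is below; the LOCAL_RANK environment variable is assumed to be provided by the launcher (e.g. torchrun) and is not taken from this log.

    import os
    import torch
    import torch.distributed as dist

    def main() -> None:
        rank = int(os.environ["LOCAL_RANK"])  # assumed launcher-provided
        dist.init_process_group(
            backend="nccl",
            device_id=torch.device("cuda", rank),  # silences the barrier() warning
        )
        try:
            dist.barrier()  # collective now bound to an explicit device
        finally:
            dist.destroy_process_group()  # avoids the resource-leak warning at exit

    if __name__ == "__main__":
        main()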
2025-12-04T13:38:32.0076287Z ====================== 1 failed, 32 deselected in 33.11s ======================= 2025-12-04T13:38:32.0076326Z Got exit code 1 2025-12-04T13:38:32.0076376Z Retrying single test... 2025-12-04T13:38:32.0076567Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-9ad8d28cd3d2f972.xml 2025-12-04T13:38:32.0076627Z ============================= test session starts ============================== 2025-12-04T13:38:32.0076740Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.0076793Z cachedir: .pytest_cache 2025-12-04T13:38:32.0076951Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.0076999Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.0077039Z configfile: pytest.ini 2025-12-04T13:38:32.0077204Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.0077277Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.0077504Z stepcurrent: skipping 4 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_no_shard_cuda 2025-12-04T13:38:32.0077546Z Running 1 items in this shard 2025-12-04T13:38:32.0077548Z 2025-12-04T13:38:32.0077859Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_no_shard_cuda I1204 13:00:15.020000 382899 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 382968 2025-12-04T13:38:32.0078015Z I1204 13:00:15.020000 382899 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 382969 2025-12-04T13:38:32.0078168Z I1204 13:00:15.021000 382899 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 382970 2025-12-04T13:38:32.0078320Z I1204 13:00:15.021000 382899 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 382971 2025-12-04T13:38:32.0078613Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0078666Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0078962Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0079013Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0079627Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.0079669Z _warn_cpu_init() 2025-12-04T13:38:32.0079968Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0080019Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0080593Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0080647Z _warn_cpu_init() 2025-12-04T13:38:32.0081222Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0081275Z _warn_cpu_init() 2025-12-04T13:38:32.0081563Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0081613Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0082182Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0082222Z _warn_cpu_init() 2025-12-04T13:38:32.0082509Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0082589Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0082873Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0082952Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0083238Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:38:32.0083314Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0083617Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0083689Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0084963Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.0085092Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.0086346Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.0086492Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.0087781Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. 
If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.0087903Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.0088133Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0088181Z return func(*args, **kwargs) 2025-12-04T13:38:32.0089460Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.0089620Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.0089849Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0089891Z return func(*args, **kwargs) 2025-12-04T13:38:32.0090117Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0090173Z return func(*args, **kwargs) 2025-12-04T13:38:32.0090398Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0090439Z return func(*args, **kwargs) 2025-12-04T13:38:32.0090868Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0090910Z return func(*args, **kwargs) 2025-12-04T13:38:32.0091129Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0091171Z return func(*args, **kwargs) 2025-12-04T13:38:32.0091389Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T13:38:32.0091432Z return func(*args, **kwargs) 2025-12-04T13:38:32.0091654Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0091697Z return func(*args, **kwargs) 2025-12-04T13:38:32.0091990Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.0092033Z return func(*args, **kwargs) 2025-12-04T13:38:32.0092177Z [rank0]:E1204 13:00:46.055000 382968 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0092343Z [rank0]:E1204 13:00:46.055000 382968 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0092635Z [rank0]:E1204 13:00:46.055000 382968 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0092792Z [rank0]:E1204 13:00:46.055000 382968 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0093096Z [rank0]:E1204 13:00:46.055000 382968 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0093222Z [rank0]:E1204 13:00:46.055000 382968 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0093506Z [rank0]:E1204 13:00:46.055000 382968 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0093656Z [rank0]:E1204 13:00:46.055000 382968 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0093946Z [rank0]:E1204 13:00:46.055000 382968 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0094094Z [rank0]:E1204 13:00:46.055000 382968 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0094375Z [rank0]:E1204 13:00:46.055000 382968 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0094514Z [rank0]:E1204 13:00:46.055000 382968 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0094804Z [rank0]:E1204 13:00:46.055000 382968 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0094954Z [rank0]:E1204 13:00:46.055000 382968 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0095447Z [rank0]:E1204 13:00:46.055000 382968 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver 
API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 0. CUDA driver allocated memory was 2453667840 and is now 4013948928. 2025-12-04T13:38:32.0095566Z [rank0]:E1204 13:00:46.055000 382968 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0095762Z [rank0]:E1204 13:00:46.055000 382968 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0096127Z [rank0]:E1204 13:00:46.055000 382968 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda 2025-12-04T13:38:32.0096244Z [rank0]:E1204 13:00:46.055000 382968 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0096456Z [rank0]:E1204 13:00:46.055000 382968 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0096623Z [rank0]:E1204 13:00:46.055000 382968 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.0096663Z dist init r=0, world=4 2025-12-04T13:38:32.0096803Z [rank2]:E1204 13:00:46.064000 382970 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0096963Z [rank2]:E1204 13:00:46.064000 382970 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0097263Z [rank2]:E1204 13:00:46.064000 382970 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0097417Z [rank2]:E1204 13:00:46.064000 382970 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0097706Z [rank2]:E1204 13:00:46.064000 382970 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0097833Z [rank2]:E1204 13:00:46.064000 382970 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0098120Z [rank2]:E1204 13:00:46.064000 382970 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0098271Z [rank2]:E1204 13:00:46.064000 382970 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0098545Z [rank2]:E1204 13:00:46.064000 382970 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0098693Z [rank2]:E1204 13:00:46.064000 382970 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0098977Z [rank2]:E1204 13:00:46.064000 382970 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0099115Z [rank2]:E1204 13:00:46.064000 382970 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0099404Z [rank2]:E1204 13:00:46.064000 382970 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0099550Z [rank2]:E1204 13:00:46.064000 382970 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0100062Z [rank2]:E1204 13:00:46.064000 382970 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 2. CUDA driver allocated memory was 2300575744 and is now 3860856832. 2025-12-04T13:38:32.0100180Z [rank2]:E1204 13:00:46.064000 382970 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0100381Z [rank2]:E1204 13:00:46.064000 382970 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0100739Z [rank2]:E1204 13:00:46.064000 382970 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda 2025-12-04T13:38:32.0100854Z [rank2]:E1204 13:00:46.064000 382970 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0101069Z [rank2]:E1204 13:00:46.064000 382970 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0101232Z [rank2]:E1204 13:00:46.064000 382970 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.0101273Z dist init r=2, world=4 2025-12-04T13:38:32.0101424Z [rank3]:E1204 13:00:46.066000 382971 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0101585Z [rank3]:E1204 13:00:46.066000 382971 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0101869Z [rank3]:E1204 13:00:46.066000 382971 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0102027Z [rank3]:E1204 13:00:46.066000 382971 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0102312Z [rank3]:E1204 13:00:46.066000 382971 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0102450Z [rank3]:E1204 13:00:46.066000 382971 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0102728Z [rank3]:E1204 13:00:46.066000 382971 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0102875Z [rank3]:E1204 13:00:46.066000 382971 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0103153Z [rank3]:E1204 13:00:46.066000 382971 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0103311Z [rank3]:E1204 13:00:46.066000 382971 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0103588Z [rank3]:E1204 13:00:46.066000 382971 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0103737Z [rank3]:E1204 13:00:46.066000 382971 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0104016Z [rank3]:E1204 13:00:46.066000 382971 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0104168Z [rank3]:E1204 13:00:46.066000 382971 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0104648Z [rank3]:E1204 13:00:46.066000 382971 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 3. CUDA driver allocated memory was 2250244096 and is now 3810525184. 
2025-12-04T13:38:32.0104764Z [rank3]:E1204 13:00:46.066000 382971 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0104957Z [rank3]:E1204 13:00:46.066000 382971 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0105314Z [rank3]:E1204 13:00:46.066000 382971 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda 2025-12-04T13:38:32.0105428Z [rank3]:E1204 13:00:46.066000 382971 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0105642Z [rank3]:E1204 13:00:46.066000 382971 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0105817Z [rank3]:E1204 13:00:46.066000 382971 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.0105857Z dist init r=3, world=4 2025-12-04T13:38:32.0105997Z [rank1]:E1204 13:00:46.110000 382969 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0106154Z [rank1]:E1204 13:00:46.110000 382969 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0106444Z [rank1]:E1204 13:00:46.110000 382969 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0106605Z [rank1]:E1204 13:00:46.110000 382969 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0106893Z [rank1]:E1204 13:00:46.110000 382969 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0107015Z [rank1]:E1204 13:00:46.110000 382969 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0107293Z [rank1]:E1204 13:00:46.110000 382969 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0107458Z [rank1]:E1204 13:00:46.110000 382969 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0107734Z [rank1]:E1204 13:00:46.110000 382969 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0107898Z [rank1]:E1204 13:00:46.110000 382969 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0108176Z [rank1]:E1204 13:00:46.110000 382969 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0123456Z [rank1]:E1204 13:00:46.110000 382969 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.0123771Z [rank1]:E1204 13:00:46.110000 382969 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0123932Z [rank1]:E1204 13:00:46.110000 382969 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0124419Z [rank1]:E1204 13:00:46.110000 382969 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 1. CUDA driver allocated memory was 2317352960 and is now 3877634048. 2025-12-04T13:38:32.0124541Z [rank1]:E1204 13:00:46.110000 382969 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0124744Z [rank1]:E1204 13:00:46.110000 382969 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0125114Z [rank1]:E1204 13:00:46.110000 382969 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda 2025-12-04T13:38:32.0125286Z [rank1]:E1204 13:00:46.110000 382969 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0125504Z [rank1]:E1204 13:00:46.110000 382969 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0125673Z [rank1]:E1204 13:00:46.110000 382969 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.0125717Z dist init r=1, world=4 2025-12-04T13:38:32.0126059Z [rank0]:[W1204 13:00:46.246017877 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.0126102Z FAILED [32.8533s] [100%] 2025-12-04T13:38:32.0126124Z 2025-12-04T13:38:32.0126188Z =================================== FAILURES =================================== 2025-12-04T13:38:32.0126291Z __ TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda ___ 2025-12-04T13:38:32.0126344Z Traceback (most recent call last): 2025-12-04T13:38:32.0126514Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.0126562Z self._join_processes(fn) 2025-12-04T13:38:32.0126739Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.0126811Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.0126996Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.0127041Z raise RuntimeError(error) 2025-12-04T13:38:32.0127127Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.0127192Z Traceback (most recent call last): 2025-12-04T13:38:32.0127362Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0127406Z getattr(self, test_name)() 2025-12-04T13:38:32.0127570Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0127606Z fn() 2025-12-04T13:38:32.0127763Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0127806Z method(*args, **kwargs) 2025-12-04T13:38:32.0127962Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0128003Z method(*args, **kwargs) 2025-12-04T13:38:32.0128161Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0128201Z with policy(): 2025-12-04T13:38:32.0128361Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0128402Z raise RuntimeError(msg) 2025-12-04T13:38:32.0128762Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 0. CUDA driver allocated memory was 2453667840 and is now 4013948928. 
2025-12-04T13:38:32.0128766Z 2025-12-04T13:38:32.0128843Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0129081Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda 2025-12-04T13:38:32.0129085Z 2025-12-04T13:38:32.0129178Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0129192Z 2025-12-04T13:38:32.0129194Z 2025-12-04T13:38:32.0129273Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.0129366Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.0129659Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-9ad8d28cd3d2f972.xml - 2025-12-04T13:38:32.0129723Z =========================== short test summary info ============================ 2025-12-04T13:38:32.0129979Z FAILED [32.8533s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.0130028Z Traceback (most recent call last): 2025-12-04T13:38:32.0130214Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0130266Z getattr(self, test_name)() 2025-12-04T13:38:32.0130429Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0130467Z fn() 2025-12-04T13:38:32.0130622Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0130665Z method(*args, **kwargs) 2025-12-04T13:38:32.0130824Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0130877Z method(*args, **kwargs) 2025-12-04T13:38:32.0131033Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0131071Z with policy(): 2025-12-04T13:38:32.0131245Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0131288Z raise RuntimeError(msg) 2025-12-04T13:38:32.0131644Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 0. CUDA driver allocated memory was 2453667840 and is now 4013948928. 2025-12-04T13:38:32.0131647Z 2025-12-04T13:38:32.0131722Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0131959Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_no_shard_cuda 2025-12-04T13:38:32.0131961Z 2025-12-04T13:38:32.0132049Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0132118Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
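Two warnings repeated in this run point at process-group hygiene: barrier() complains that no device is bound to the group ("You can specify `device_id` in `init_process_group` to mute this warning"), and ProcessGroupNCCL warns that destroy_process_group() was never called before program exit. A hedged sketch of the recommended pattern, assuming a launcher such as torchrun has already set LOCAL_RANK/RANK/WORLD_SIZE/MASTER_ADDR/MASTER_PORT (the module and function names below are illustrative, not taken from this test suite):

    import os
    import torch
    import torch.distributed as dist

    def main():
        # Pin this process to its GPU and bind the process group to that device,
        # which silences the barrier() device warning.
        local_rank = int(os.environ["LOCAL_RANK"])
        torch.cuda.set_device(local_rank)
        dist.init_process_group(
            backend="nccl",
            device_id=torch.device("cuda", local_rank),
        )
        try:
            dist.barrier()
            # ... test or training body ...
        finally:
            # Explicit teardown avoids the "destroy_process_group() was not called
            # before program exit" warning and releases communicator resources.
            dist.destroy_process_group()

    if __name__ == "__main__":
        main()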
2025-12-04T13:38:32.0132183Z ====================== 1 failed, 32 deselected in 33.00s ======================= 2025-12-04T13:38:32.0132225Z Got exit code 1 2025-12-04T13:38:32.0132412Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_no_shard_cuda 2025-12-04T13:38:32.0132542Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:32.0132737Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4b715ed61ff2d6c7.xml 2025-12-04T13:38:32.0132798Z ============================= test session starts ============================== 2025-12-04T13:38:32.0132916Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.0132958Z cachedir: .pytest_cache 2025-12-04T13:38:32.0133123Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.0133184Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.0133229Z configfile: pytest.ini 2025-12-04T13:38:32.0133394Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.0133472Z collecting ... collected 60 items / 5 deselected / 55 selected 2025-12-04T13:38:32.0133527Z stepcurrent: skipping 5 already run items. 2025-12-04T13:38:32.0133574Z Running 28 items in this shard 2025-12-04T13:38:32.0133577Z 2025-12-04T13:38:32.0133887Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_no_shard_cuda I1204 13:00:50.334000 383301 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 383370 2025-12-04T13:38:32.0134057Z I1204 13:00:50.335000 383301 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 383371 2025-12-04T13:38:32.0134213Z I1204 13:00:50.335000 383301 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 383372 2025-12-04T13:38:32.0134367Z I1204 13:00:50.336000 383301 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 383373 2025-12-04T13:38:32.0134670Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0134734Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0135330Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0135379Z _warn_cpu_init() 2025-12-04T13:38:32.0135672Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:38:32.0135723Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0136299Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0136341Z _warn_cpu_init() 2025-12-04T13:38:32.0136630Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0136684Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0137259Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0137300Z _warn_cpu_init() 2025-12-04T13:38:32.0137599Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0137652Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0138225Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0138265Z _warn_cpu_init() 2025-12-04T13:38:32.0138565Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0138648Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0138938Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0139014Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0139305Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:38:32.0139392Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0139723Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0139815Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0140108Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.0140155Z return func(*args, **kwargs) 2025-12-04T13:38:32.0140385Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0140433Z return func(*args, **kwargs) 2025-12-04T13:38:32.0140655Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0140701Z return func(*args, **kwargs) 2025-12-04T13:38:32.0140925Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0140971Z return func(*args, **kwargs) 2025-12-04T13:38:32.0141195Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0141236Z return func(*args, **kwargs) 2025-12-04T13:38:32.0141460Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0141502Z return func(*args, **kwargs) 2025-12-04T13:38:32.0141724Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0141766Z return func(*args, **kwargs) 2025-12-04T13:38:32.0142001Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0142043Z return func(*args, **kwargs) 2025-12-04T13:38:32.0142264Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T13:38:32.0142305Z return func(*args, **kwargs) 2025-12-04T13:38:32.0142454Z [rank0]:E1204 13:01:27.673000 383370 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0142622Z [rank0]:E1204 13:01:27.673000 383370 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0142931Z [rank0]:E1204 13:01:27.673000 383370 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0143093Z [rank0]:E1204 13:01:27.673000 383370 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0143378Z [rank0]:E1204 13:01:27.673000 383370 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0143508Z [rank0]:E1204 13:01:27.673000 383370 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0143800Z [rank0]:E1204 13:01:27.673000 383370 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0143953Z [rank0]:E1204 13:01:27.673000 383370 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0144239Z [rank0]:E1204 13:01:27.673000 383370 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0144390Z [rank0]:E1204 13:01:27.673000 383370 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0144665Z [rank0]:E1204 13:01:27.673000 383370 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0144807Z [rank0]:E1204 13:01:27.673000 383370 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0145092Z [rank0]:E1204 13:01:27.673000 383370 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0145241Z [rank0]:E1204 13:01:27.673000 383370 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0145721Z [rank0]:E1204 13:01:27.673000 383370 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 0. CUDA driver allocated memory was 2453667840 and is now 3990880256. 
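The repeated _warn_cpu_init() UserWarning earlier in this session says the wrapped module is still on CPU and recommends passing `device_id` so FSDP moves it to the GPU before sharding initialization (also required for sync_module_states=True). A minimal sketch of that call shape, assuming a process group is already initialized and using a placeholder module rather than the model this test actually builds:

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    model = torch.nn.Linear(1024, 1024)  # placeholder module, still on CPU
    fsdp_model = FSDP(
        model,
        device_id=torch.cuda.current_device(),  # move to this rank's GPU before sharding init
        sync_module_states=True,                 # valid now that init happens on the GPU
    )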
2025-12-04T13:38:32.0145839Z [rank0]:E1204 13:01:27.673000 383370 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0146038Z [rank0]:E1204 13:01:27.673000 383370 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0146418Z [rank0]:E1204 13:01:27.673000 383370 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda 2025-12-04T13:38:32.0146532Z [rank0]:E1204 13:01:27.673000 383370 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0146748Z [rank0]:E1204 13:01:27.673000 383370 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0146913Z [rank0]:E1204 13:01:27.673000 383370 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.0146956Z dist init r=0, world=4 2025-12-04T13:38:32.0147105Z [rank1]:E1204 13:01:27.675000 383371 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0147270Z [rank1]:E1204 13:01:27.675000 383371 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0147558Z [rank1]:E1204 13:01:27.675000 383371 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0147716Z [rank1]:E1204 13:01:27.675000 383371 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0148014Z [rank1]:E1204 13:01:27.675000 383371 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0148138Z [rank1]:E1204 13:01:27.675000 383371 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0148431Z [rank1]:E1204 13:01:27.675000 383371 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0148578Z [rank1]:E1204 13:01:27.675000 383371 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0148855Z [rank1]:E1204 13:01:27.675000 383371 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0149002Z [rank1]:E1204 13:01:27.675000 383371 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0149282Z [rank1]:E1204 13:01:27.675000 383371 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0149422Z [rank1]:E1204 13:01:27.675000 383371 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.0149749Z [rank1]:E1204 13:01:27.675000 383371 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0149900Z [rank1]:E1204 13:01:27.675000 383371 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0150378Z [rank1]:E1204 13:01:27.675000 383371 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 1. CUDA driver allocated memory was 2317352960 and is now 3854565376. 2025-12-04T13:38:32.0150508Z [rank1]:E1204 13:01:27.675000 383371 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0150705Z [rank1]:E1204 13:01:27.675000 383371 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0151066Z [rank1]:E1204 13:01:27.675000 383371 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda 2025-12-04T13:38:32.0151183Z [rank1]:E1204 13:01:27.675000 383371 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0151397Z [rank1]:E1204 13:01:27.675000 383371 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0151584Z [rank1]:E1204 13:01:27.675000 383371 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.0151624Z dist init r=1, world=4 2025-12-04T13:38:32.0151767Z [rank2]:E1204 13:01:27.681000 383372 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0151925Z [rank2]:E1204 13:01:27.681000 383372 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0152216Z [rank2]:E1204 13:01:27.681000 383372 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0152382Z [rank2]:E1204 13:01:27.681000 383372 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0152670Z [rank2]:E1204 13:01:27.681000 383372 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0152812Z [rank2]:E1204 13:01:27.681000 383372 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0153088Z [rank2]:E1204 13:01:27.681000 383372 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0153238Z [rank2]:E1204 13:01:27.681000 383372 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:38:32.0153518Z [rank2]:E1204 13:01:27.681000 383372 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0153673Z [rank2]:E1204 13:01:27.681000 383372 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0153951Z [rank2]:E1204 13:01:27.681000 383372 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0154090Z [rank2]:E1204 13:01:27.681000 383372 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0154369Z [rank2]:E1204 13:01:27.681000 383372 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0154517Z [rank2]:E1204 13:01:27.681000 383372 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0155003Z [rank2]:E1204 13:01:27.681000 383372 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 2. CUDA driver allocated memory was 2300575744 and is now 3837788160. 2025-12-04T13:38:32.0155119Z [rank2]:E1204 13:01:27.681000 383372 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0155317Z [rank2]:E1204 13:01:27.681000 383372 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0155677Z [rank2]:E1204 13:01:27.681000 383372 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda 2025-12-04T13:38:32.0155801Z [rank2]:E1204 13:01:27.681000 383372 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0156018Z [rank2]:E1204 13:01:27.681000 383372 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0156182Z [rank2]:E1204 13:01:27.681000 383372 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.0156224Z dist init r=2, world=4 2025-12-04T13:38:32.0156361Z [rank3]:E1204 13:01:27.745000 383373 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0156535Z [rank3]:E1204 13:01:27.745000 383373 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0156821Z [rank3]:E1204 13:01:27.745000 383373 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0156988Z [rank3]:E1204 13:01:27.745000 383373 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:38:32.0157271Z [rank3]:E1204 13:01:27.745000 383373 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0157397Z [rank3]:E1204 13:01:27.745000 383373 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0157677Z [rank3]:E1204 13:01:27.745000 383373 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0157827Z [rank3]:E1204 13:01:27.745000 383373 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0158108Z [rank3]:E1204 13:01:27.745000 383373 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0158255Z [rank3]:E1204 13:01:27.745000 383373 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0158531Z [rank3]:E1204 13:01:27.745000 383373 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0158668Z [rank3]:E1204 13:01:27.745000 383373 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0158949Z [rank3]:E1204 13:01:27.745000 383373 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0159110Z [rank3]:E1204 13:01:27.745000 383373 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0159624Z [rank3]:E1204 13:01:27.745000 383373 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 3. CUDA driver allocated memory was 2250244096 and is now 3787456512. 
2025-12-04T13:38:32.0159745Z [rank3]:E1204 13:01:27.745000 383373 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0159939Z [rank3]:E1204 13:01:27.745000 383373 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0160317Z [rank3]:E1204 13:01:27.745000 383373 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda 2025-12-04T13:38:32.0160431Z [rank3]:E1204 13:01:27.745000 383373 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0160645Z [rank3]:E1204 13:01:27.745000 383373 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0160824Z [rank3]:E1204 13:01:27.745000 383373 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.0160864Z dist init r=3, world=4 2025-12-04T13:38:32.0161203Z [rank0]:[W1204 13:01:27.854831028 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.0161263Z FAILED [39.1581s] [ 3%] 2025-12-04T13:38:32.0161266Z 2025-12-04T13:38:32.0161327Z =================================== FAILURES =================================== 2025-12-04T13:38:32.0161429Z ___ TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda ___ 2025-12-04T13:38:32.0161480Z Traceback (most recent call last): 2025-12-04T13:38:32.0161644Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.0161692Z self._join_processes(fn) 2025-12-04T13:38:32.0161867Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.0161924Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.0162104Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.0162154Z raise RuntimeError(error) 2025-12-04T13:38:32.0162234Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.0162282Z Traceback (most recent call last): 2025-12-04T13:38:32.0162443Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0162490Z getattr(self, test_name)() 2025-12-04T13:38:32.0162652Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0162690Z fn() 2025-12-04T13:38:32.0162846Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0162887Z method(*args, **kwargs) 2025-12-04T13:38:32.0163044Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0163085Z method(*args, **kwargs) 2025-12-04T13:38:32.0163255Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0163295Z with policy(): 2025-12-04T13:38:32.0163451Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0163493Z raise RuntimeError(msg) 2025-12-04T13:38:32.0163845Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 0. CUDA driver allocated memory was 2453667840 and is now 3990880256. 2025-12-04T13:38:32.0163848Z 2025-12-04T13:38:32.0163924Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0164174Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda 2025-12-04T13:38:32.0164178Z 2025-12-04T13:38:32.0164267Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0164272Z 2025-12-04T13:38:32.0164332Z Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.0164381Z Traceback (most recent call last): 2025-12-04T13:38:32.0164544Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0164602Z getattr(self, test_name)() 2025-12-04T13:38:32.0164760Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0164799Z fn() 2025-12-04T13:38:32.0164954Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0165009Z method(*args, **kwargs) 2025-12-04T13:38:32.0165163Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0165206Z method(*args, **kwargs) 2025-12-04T13:38:32.0165358Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0165400Z with policy(): 2025-12-04T13:38:32.0165553Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0165599Z raise RuntimeError(msg) 2025-12-04T13:38:32.0165948Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 1. CUDA driver allocated memory was 2317352960 and is now 3854565376. 
2025-12-04T13:38:32.0165952Z 2025-12-04T13:38:32.0166035Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0166266Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda 2025-12-04T13:38:32.0166272Z 2025-12-04T13:38:32.0166359Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0166361Z 2025-12-04T13:38:32.0166363Z 2025-12-04T13:38:32.0166444Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.0166534Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.0166771Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4b715ed61ff2d6c7.xml - 2025-12-04T13:38:32.0166832Z =========================== short test summary info ============================ 2025-12-04T13:38:32.0167095Z FAILED [39.1581s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.0167284Z Traceback (most recent call last): 2025-12-04T13:38:32.0167453Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0167495Z getattr(self, test_name)() 2025-12-04T13:38:32.0167660Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0167697Z fn() 2025-12-04T13:38:32.0167853Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0167895Z method(*args, **kwargs) 2025-12-04T13:38:32.0168062Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0168109Z method(*args, **kwargs) 2025-12-04T13:38:32.0168261Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0168304Z with policy(): 2025-12-04T13:38:32.0168459Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0168503Z raise RuntimeError(msg) 2025-12-04T13:38:32.0168856Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 0. CUDA driver allocated memory was 2453667840 and is now 3990880256. 
2025-12-04T13:38:32.0168869Z 2025-12-04T13:38:32.0168946Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0169179Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda 2025-12-04T13:38:32.0169191Z 2025-12-04T13:38:32.0169281Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0169283Z 2025-12-04T13:38:32.0169342Z Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.0169391Z Traceback (most recent call last): 2025-12-04T13:38:32.0169554Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0169638Z getattr(self, test_name)() 2025-12-04T13:38:32.0169803Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0169838Z fn() 2025-12-04T13:38:32.0169994Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0170035Z method(*args, **kwargs) 2025-12-04T13:38:32.0170193Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0170233Z method(*args, **kwargs) 2025-12-04T13:38:32.0170389Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0170426Z with policy(): 2025-12-04T13:38:32.0170580Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0170622Z raise RuntimeError(msg) 2025-12-04T13:38:32.0170973Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 1. CUDA driver allocated memory was 2317352960 and is now 3854565376. 2025-12-04T13:38:32.0170976Z 2025-12-04T13:38:32.0171049Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0171297Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda 2025-12-04T13:38:32.0171299Z 2025-12-04T13:38:32.0171387Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0171458Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.0171524Z ======================= 1 failed, 5 deselected in 39.32s ======================= 2025-12-04T13:38:32.0171562Z Got exit code 1 2025-12-04T13:38:32.0171606Z Retrying single test... 
2025-12-04T13:38:32.0171795Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-69f8ccb79d38e88f.xml 2025-12-04T13:38:32.0171870Z ============================= test session starts ============================== 2025-12-04T13:38:32.0171988Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.0172032Z cachedir: .pytest_cache 2025-12-04T13:38:32.0172194Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.0172244Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.0172285Z configfile: pytest.ini 2025-12-04T13:38:32.0172451Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.0172539Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.0172764Z stepcurrent: skipping 5 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_no_shard_cuda 2025-12-04T13:38:32.0172808Z Running 1 items in this shard 2025-12-04T13:38:32.0172823Z 2025-12-04T13:38:32.0173134Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_no_shard_cuda I1204 13:01:32.158000 383703 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 383772 2025-12-04T13:38:32.0173290Z I1204 13:01:32.159000 383703 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 383773 2025-12-04T13:38:32.0173446Z I1204 13:01:32.159000 383703 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 383774 2025-12-04T13:38:32.0173603Z I1204 13:01:32.160000 383703 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 383775 2025-12-04T13:38:32.0173896Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0173952Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0174530Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0174572Z _warn_cpu_init() 2025-12-04T13:38:32.0174861Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0174915Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0175498Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0175537Z _warn_cpu_init() 2025-12-04T13:38:32.0175827Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0175878Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0176177Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0176226Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0176801Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0176859Z _warn_cpu_init() 2025-12-04T13:38:32.0177428Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0177481Z _warn_cpu_init() 2025-12-04T13:38:32.0177769Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0177849Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0178145Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0178225Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0178515Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0178589Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0178876Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:38:32.0178948Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0179243Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.0179287Z return func(*args, **kwargs) 2025-12-04T13:38:32.0179519Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0179562Z return func(*args, **kwargs) 2025-12-04T13:38:32.0179840Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0179883Z return func(*args, **kwargs) 2025-12-04T13:38:32.0180108Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0180149Z return func(*args, **kwargs) 2025-12-04T13:38:32.0180373Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0180414Z return func(*args, **kwargs) 2025-12-04T13:38:32.0180652Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0180697Z return func(*args, **kwargs) 2025-12-04T13:38:32.0180917Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0180961Z return func(*args, **kwargs) 2025-12-04T13:38:32.0181177Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0181231Z return func(*args, **kwargs) 2025-12-04T13:38:32.0181452Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T13:38:32.0181493Z return func(*args, **kwargs) 2025-12-04T13:38:32.0181639Z [rank3]:E1204 13:02:09.629000 383775 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0181818Z [rank3]:E1204 13:02:09.629000 383775 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0182110Z [rank3]:E1204 13:02:09.629000 383775 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0182266Z [rank3]:E1204 13:02:09.629000 383775 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0182557Z [rank3]:E1204 13:02:09.629000 383775 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0182682Z [rank3]:E1204 13:02:09.629000 383775 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0182966Z [rank3]:E1204 13:02:09.629000 383775 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0183118Z [rank3]:E1204 13:02:09.629000 383775 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0183398Z [rank3]:E1204 13:02:09.629000 383775 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0183546Z [rank3]:E1204 13:02:09.629000 383775 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0183833Z [rank3]:E1204 13:02:09.629000 383775 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0183973Z [rank3]:E1204 13:02:09.629000 383775 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0184249Z [rank3]:E1204 13:02:09.629000 383775 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0184398Z [rank3]:E1204 13:02:09.629000 383775 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0184884Z [rank3]:E1204 13:02:09.629000 383775 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 3. CUDA driver allocated memory was 2250244096 and is now 3787456512. 
2025-12-04T13:38:32.0185005Z [rank3]:E1204 13:02:09.629000 383775 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0185200Z [rank3]:E1204 13:02:09.629000 383775 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0185564Z [rank3]:E1204 13:02:09.629000 383775 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda 2025-12-04T13:38:32.0185720Z [rank3]:E1204 13:02:09.629000 383775 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0185932Z [rank3]:E1204 13:02:09.629000 383775 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0186112Z [rank3]:E1204 13:02:09.629000 383775 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.0186151Z dist init r=3, world=4 2025-12-04T13:38:32.0186290Z [rank2]:E1204 13:02:09.668000 383774 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0186448Z [rank2]:E1204 13:02:09.668000 383774 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0186738Z [rank2]:E1204 13:02:09.668000 383774 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0186891Z [rank2]:E1204 13:02:09.668000 383774 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0187181Z [rank2]:E1204 13:02:09.668000 383774 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0187306Z [rank2]:E1204 13:02:09.668000 383774 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0187583Z [rank2]:E1204 13:02:09.668000 383774 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0187734Z [rank2]:E1204 13:02:09.668000 383774 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0188010Z [rank2]:E1204 13:02:09.668000 383774 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0188168Z [rank2]:E1204 13:02:09.668000 383774 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0188445Z [rank2]:E1204 13:02:09.668000 383774 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0188583Z [rank2]:E1204 13:02:09.668000 383774 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.0188862Z [rank2]:E1204 13:02:09.668000 383774 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0189008Z [rank2]:E1204 13:02:09.668000 383774 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0189494Z [rank2]:E1204 13:02:09.668000 383774 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 2. CUDA driver allocated memory was 2300575744 and is now 3837788160. 2025-12-04T13:38:32.0189642Z [rank2]:E1204 13:02:09.668000 383774 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0189841Z [rank2]:E1204 13:02:09.668000 383774 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0190218Z [rank2]:E1204 13:02:09.668000 383774 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda 2025-12-04T13:38:32.0190343Z [rank2]:E1204 13:02:09.668000 383774 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0190554Z [rank2]:E1204 13:02:09.668000 383774 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0190717Z [rank2]:E1204 13:02:09.668000 383774 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.0190757Z dist init r=2, world=4 2025-12-04T13:38:32.0190894Z [rank1]:E1204 13:02:09.670000 383773 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0191055Z [rank1]:E1204 13:02:09.670000 383773 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0191342Z [rank1]:E1204 13:02:09.670000 383773 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0191497Z [rank1]:E1204 13:02:09.670000 383773 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0191782Z [rank1]:E1204 13:02:09.670000 383773 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0191904Z [rank1]:E1204 13:02:09.670000 383773 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0192183Z [rank1]:E1204 13:02:09.670000 383773 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0192331Z [rank1]:E1204 13:02:09.670000 383773 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:38:32.0192622Z [rank1]:E1204 13:02:09.670000 383773 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0192769Z [rank1]:E1204 13:02:09.670000 383773 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0193046Z [rank1]:E1204 13:02:09.670000 383773 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0193182Z [rank1]:E1204 13:02:09.670000 383773 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0193483Z [rank1]:E1204 13:02:09.670000 383773 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0193633Z [rank1]:E1204 13:02:09.670000 383773 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0194106Z [rank1]:E1204 13:02:09.670000 383773 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 1. CUDA driver allocated memory was 2317352960 and is now 3854565376. 2025-12-04T13:38:32.0194231Z [rank1]:E1204 13:02:09.670000 383773 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0194426Z [rank1]:E1204 13:02:09.670000 383773 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0194795Z [rank1]:E1204 13:02:09.670000 383773 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda 2025-12-04T13:38:32.0194909Z [rank1]:E1204 13:02:09.670000 383773 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0195121Z [rank1]:E1204 13:02:09.670000 383773 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0195286Z [rank1]:E1204 13:02:09.670000 383773 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.0195324Z dist init r=1, world=4 2025-12-04T13:38:32.0195463Z [rank0]:E1204 13:02:09.683000 383772 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0195624Z [rank0]:E1204 13:02:09.683000 383772 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0195913Z [rank0]:E1204 13:02:09.683000 383772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0196067Z [rank0]:E1204 13:02:09.683000 383772 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:38:32.0196355Z [rank0]:E1204 13:02:09.683000 383772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0196480Z [rank0]:E1204 13:02:09.683000 383772 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0196769Z [rank0]:E1204 13:02:09.683000 383772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0196916Z [rank0]:E1204 13:02:09.683000 383772 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0197191Z [rank0]:E1204 13:02:09.683000 383772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0197341Z [rank0]:E1204 13:02:09.683000 383772 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0197628Z [rank0]:E1204 13:02:09.683000 383772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0197767Z [rank0]:E1204 13:02:09.683000 383772 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0198044Z [rank0]:E1204 13:02:09.683000 383772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0198195Z [rank0]:E1204 13:02:09.683000 383772 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0198679Z [rank0]:E1204 13:02:09.683000 383772 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 0. CUDA driver allocated memory was 2453667840 and is now 3990880256. 
2025-12-04T13:38:32.0198804Z [rank0]:E1204 13:02:09.683000 383772 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0199002Z [rank0]:E1204 13:02:09.683000 383772 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0199356Z [rank0]:E1204 13:02:09.683000 383772 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda 2025-12-04T13:38:32.0199470Z [rank0]:E1204 13:02:09.683000 383772 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0199719Z [rank0]:E1204 13:02:09.683000 383772 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0199885Z [rank0]:E1204 13:02:09.683000 383772 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.0199925Z dist init r=0, world=4 2025-12-04T13:38:32.0200265Z [rank0]:[W1204 13:02:09.961694222 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.0200307Z FAILED [39.4593s] [100%] 2025-12-04T13:38:32.0200309Z 2025-12-04T13:38:32.0200365Z =================================== FAILURES =================================== 2025-12-04T13:38:32.0200466Z ___ TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda ___ 2025-12-04T13:38:32.0200512Z Traceback (most recent call last): 2025-12-04T13:38:32.0200679Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.0200723Z self._join_processes(fn) 2025-12-04T13:38:32.0200913Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.0200968Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.0201148Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.0201191Z raise RuntimeError(error) 2025-12-04T13:38:32.0201272Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.0201318Z Traceback (most recent call last): 2025-12-04T13:38:32.0201481Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0201525Z getattr(self, test_name)() 2025-12-04T13:38:32.0201696Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0201734Z fn() 2025-12-04T13:38:32.0201887Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0201930Z method(*args, **kwargs) 2025-12-04T13:38:32.0202080Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0202123Z method(*args, **kwargs) 2025-12-04T13:38:32.0202273Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0202326Z with policy(): 2025-12-04T13:38:32.0202479Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0202523Z raise RuntimeError(msg) 2025-12-04T13:38:32.0202875Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 3. CUDA driver allocated memory was 2250244096 and is now 3787456512. 2025-12-04T13:38:32.0202894Z 2025-12-04T13:38:32.0202973Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0203203Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda 2025-12-04T13:38:32.0203207Z 2025-12-04T13:38:32.0203294Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0203297Z 2025-12-04T13:38:32.0203299Z 2025-12-04T13:38:32.0203375Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.0203465Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.0203700Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-69f8ccb79d38e88f.xml - 2025-12-04T13:38:32.0203761Z =========================== short test summary info ============================ 2025-12-04T13:38:32.0204011Z FAILED [39.4593s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_no_shard_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.0204056Z Traceback (most recent call last): 2025-12-04T13:38:32.0204222Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0204265Z getattr(self, test_name)() 2025-12-04T13:38:32.0204428Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0204462Z fn() 2025-12-04T13:38:32.0204618Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0204668Z method(*args, **kwargs) 2025-12-04T13:38:32.0204825Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0204864Z method(*args, **kwargs) 2025-12-04T13:38:32.0205015Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0205051Z with policy(): 2025-12-04T13:38:32.0205205Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0205249Z raise RuntimeError(msg) 2025-12-04T13:38:32.0205607Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 3. CUDA driver allocated memory was 2250244096 and is now 3787456512. 
2025-12-04T13:38:32.0205610Z 2025-12-04T13:38:32.0205687Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0205919Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda 2025-12-04T13:38:32.0205921Z 2025-12-04T13:38:32.0206009Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0206072Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.0206147Z ====================== 1 failed, 32 deselected in 39.62s ======================= 2025-12-04T13:38:32.0206183Z Got exit code 1 2025-12-04T13:38:32.0206225Z Retrying single test... 2025-12-04T13:38:32.0206413Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-0905abac027446cf.xml 2025-12-04T13:38:32.0206485Z ============================= test session starts ============================== 2025-12-04T13:38:32.0206599Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.0206642Z cachedir: .pytest_cache 2025-12-04T13:38:32.0206801Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.0206848Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.0206888Z configfile: pytest.ini 2025-12-04T13:38:32.0207053Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.0207129Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.0207354Z stepcurrent: skipping 5 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_no_shard_cuda 2025-12-04T13:38:32.0207401Z Running 1 items in this shard 2025-12-04T13:38:32.0207403Z 2025-12-04T13:38:32.0207712Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_no_shard_cuda I1204 13:02:14.070000 384105 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 384174 2025-12-04T13:38:32.0207869Z I1204 13:02:14.071000 384105 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 384175 2025-12-04T13:38:32.0208020Z I1204 13:02:14.071000 384105 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 384176 2025-12-04T13:38:32.0208176Z I1204 13:02:14.072000 384105 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 384177 2025-12-04T13:38:32.0208469Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0208523Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0209121Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0209160Z _warn_cpu_init() 2025-12-04T13:38:32.0209450Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0209498Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0210143Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0210182Z _warn_cpu_init() 2025-12-04T13:38:32.0210465Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0210531Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0210815Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0210877Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0211455Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0213977Z _warn_cpu_init() 2025-12-04T13:38:32.0214548Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0214588Z _warn_cpu_init() 2025-12-04T13:38:32.0214875Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0214953Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0215240Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:38:32.0215332Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0215640Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0215713Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0216000Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0216071Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0216364Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.0216409Z return func(*args, **kwargs) 2025-12-04T13:38:32.0216653Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0216696Z return func(*args, **kwargs) 2025-12-04T13:38:32.0216922Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0216966Z return func(*args, **kwargs) 2025-12-04T13:38:32.0217185Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0217239Z return func(*args, **kwargs) 2025-12-04T13:38:32.0217460Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0217502Z return func(*args, **kwargs) 2025-12-04T13:38:32.0217725Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0217766Z return func(*args, **kwargs) 2025-12-04T13:38:32.0217985Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0218027Z return func(*args, **kwargs) 2025-12-04T13:38:32.0218246Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0218364Z return func(*args, **kwargs) 2025-12-04T13:38:32.0218580Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T13:38:32.0218622Z return func(*args, **kwargs) 2025-12-04T13:38:32.0218769Z [rank3]:E1204 13:02:51.310000 384177 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0218935Z [rank3]:E1204 13:02:51.310000 384177 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0219229Z [rank3]:E1204 13:02:51.310000 384177 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0219388Z [rank3]:E1204 13:02:51.310000 384177 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0219722Z [rank3]:E1204 13:02:51.310000 384177 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0219862Z [rank3]:E1204 13:02:51.310000 384177 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0220144Z [rank3]:E1204 13:02:51.310000 384177 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0220292Z [rank3]:E1204 13:02:51.310000 384177 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0220572Z [rank3]:E1204 13:02:51.310000 384177 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0220722Z [rank3]:E1204 13:02:51.310000 384177 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0221012Z [rank3]:E1204 13:02:51.310000 384177 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0221150Z [rank3]:E1204 13:02:51.310000 384177 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0221426Z [rank3]:E1204 13:02:51.310000 384177 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0221590Z [rank3]:E1204 13:02:51.310000 384177 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0222072Z [rank3]:E1204 13:02:51.310000 384177 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 3. CUDA driver allocated memory was 2250244096 and is now 3787456512. 
2025-12-04T13:38:32.0222191Z [rank3]:E1204 13:02:51.310000 384177 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0222390Z [rank3]:E1204 13:02:51.310000 384177 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0222748Z [rank3]:E1204 13:02:51.310000 384177 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda 2025-12-04T13:38:32.0222892Z [rank3]:E1204 13:02:51.310000 384177 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0223105Z [rank3]:E1204 13:02:51.310000 384177 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0223273Z [rank3]:E1204 13:02:51.310000 384177 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.0223313Z dist init r=3, world=4 2025-12-04T13:38:32.0223453Z [rank2]:E1204 13:02:51.320000 384176 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0223611Z [rank2]:E1204 13:02:51.320000 384176 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0223902Z [rank2]:E1204 13:02:51.320000 384176 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0224057Z [rank2]:E1204 13:02:51.320000 384176 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0224353Z [rank2]:E1204 13:02:51.320000 384176 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0224480Z [rank2]:E1204 13:02:51.320000 384176 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0224757Z [rank2]:E1204 13:02:51.320000 384176 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0224908Z [rank2]:E1204 13:02:51.320000 384176 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0225196Z [rank2]:E1204 13:02:51.320000 384176 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0225345Z [rank2]:E1204 13:02:51.320000 384176 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0225623Z [rank2]:E1204 13:02:51.320000 384176 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0225759Z [rank2]:E1204 13:02:51.320000 384176 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.0226056Z [rank2]:E1204 13:02:51.320000 384176 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0226206Z [rank2]:E1204 13:02:51.320000 384176 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0226687Z [rank2]:E1204 13:02:51.320000 384176 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 2. CUDA driver allocated memory was 2300575744 and is now 3837788160. 2025-12-04T13:38:32.0226803Z [rank2]:E1204 13:02:51.320000 384176 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0227000Z [rank2]:E1204 13:02:51.320000 384176 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0227369Z [rank2]:E1204 13:02:51.320000 384176 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda 2025-12-04T13:38:32.0227482Z [rank2]:E1204 13:02:51.320000 384176 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0227695Z [rank2]:E1204 13:02:51.320000 384176 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0227858Z [rank2]:E1204 13:02:51.320000 384176 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.0227899Z dist init r=2, world=4 2025-12-04T13:38:32.0228034Z [rank0]:E1204 13:02:51.361000 384174 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0228196Z [rank0]:E1204 13:02:51.361000 384174 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0228493Z [rank0]:E1204 13:02:51.361000 384174 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0228646Z [rank0]:E1204 13:02:51.361000 384174 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0228932Z [rank0]:E1204 13:02:51.361000 384174 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0229057Z [rank0]:E1204 13:02:51.361000 384174 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0229335Z [rank0]:E1204 13:02:51.361000 384174 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0229491Z [rank0]:E1204 13:02:51.361000 384174 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:38:32.0229897Z [rank0]:E1204 13:02:51.361000 384174 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0230042Z [rank0]:E1204 13:02:51.361000 384174 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0230320Z [rank0]:E1204 13:02:51.361000 384174 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0230472Z [rank0]:E1204 13:02:51.361000 384174 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0230752Z [rank0]:E1204 13:02:51.361000 384174 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0230902Z [rank0]:E1204 13:02:51.361000 384174 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0231375Z [rank0]:E1204 13:02:51.361000 384174 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 0. CUDA driver allocated memory was 2453667840 and is now 3990880256. 2025-12-04T13:38:32.0231505Z [rank0]:E1204 13:02:51.361000 384174 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0231702Z [rank0]:E1204 13:02:51.361000 384174 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0232062Z [rank0]:E1204 13:02:51.361000 384174 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda 2025-12-04T13:38:32.0232177Z [rank0]:E1204 13:02:51.361000 384174 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0232388Z [rank0]:E1204 13:02:51.361000 384174 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0232554Z [rank0]:E1204 13:02:51.361000 384174 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.0232592Z dist init r=0, world=4 2025-12-04T13:38:32.0232730Z [rank1]:E1204 13:02:51.371000 384175 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0232903Z [rank1]:E1204 13:02:51.371000 384175 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0233189Z [rank1]:E1204 13:02:51.371000 384175 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0233342Z [rank1]:E1204 13:02:51.371000 384175 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:38:32.0233627Z [rank1]:E1204 13:02:51.371000 384175 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0233765Z [rank1]:E1204 13:02:51.371000 384175 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0234044Z [rank1]:E1204 13:02:51.371000 384175 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0234193Z [rank1]:E1204 13:02:51.371000 384175 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0234468Z [rank1]:E1204 13:02:51.371000 384175 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0234626Z [rank1]:E1204 13:02:51.371000 384175 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0234904Z [rank1]:E1204 13:02:51.371000 384175 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0235040Z [rank1]:E1204 13:02:51.371000 384175 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0235318Z [rank1]:E1204 13:02:51.371000 384175 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0235465Z [rank1]:E1204 13:02:51.371000 384175 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0235952Z [rank1]:E1204 13:02:51.371000 384175 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 1. CUDA driver allocated memory was 2317352960 and is now 3854565376. 
2025-12-04T13:38:32.0236069Z [rank1]:E1204 13:02:51.371000 384175 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0236267Z [rank1]:E1204 13:02:51.371000 384175 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0236623Z [rank1]:E1204 13:02:51.371000 384175 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda 2025-12-04T13:38:32.0236736Z [rank1]:E1204 13:02:51.371000 384175 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0236948Z [rank1]:E1204 13:02:51.371000 384175 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0237121Z [rank1]:E1204 13:02:51.371000 384175 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.0237161Z dist init r=1, world=4 2025-12-04T13:38:32.0237495Z [rank0]:[W1204 13:02:51.636590405 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.0237538Z FAILED [38.9577s] [100%] 2025-12-04T13:38:32.0237541Z 2025-12-04T13:38:32.0237597Z =================================== FAILURES =================================== 2025-12-04T13:38:32.0237700Z ___ TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda ___ 2025-12-04T13:38:32.0237747Z Traceback (most recent call last): 2025-12-04T13:38:32.0237923Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.0237967Z self._join_processes(fn) 2025-12-04T13:38:32.0238140Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.0238198Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.0238377Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.0238422Z raise RuntimeError(error) 2025-12-04T13:38:32.0238512Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.0238560Z Traceback (most recent call last): 2025-12-04T13:38:32.0238721Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0238765Z getattr(self, test_name)() 2025-12-04T13:38:32.0238923Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0238961Z fn() 2025-12-04T13:38:32.0239112Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0239154Z method(*args, **kwargs) 2025-12-04T13:38:32.0239427Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0239470Z method(*args, **kwargs) 2025-12-04T13:38:32.0239646Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0239705Z with policy(): 2025-12-04T13:38:32.0239858Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0239902Z raise RuntimeError(msg) 2025-12-04T13:38:32.0240253Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 3. CUDA driver allocated memory was 2250244096 and is now 3787456512. 2025-12-04T13:38:32.0240258Z 2025-12-04T13:38:32.0240333Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0240568Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda 2025-12-04T13:38:32.0240571Z 2025-12-04T13:38:32.0240658Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0240661Z 2025-12-04T13:38:32.0240663Z 2025-12-04T13:38:32.0240741Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.0240828Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.0241074Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-0905abac027446cf.xml - 2025-12-04T13:38:32.0241135Z =========================== short test summary info ============================ 2025-12-04T13:38:32.0241387Z FAILED [38.9577s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_no_shard_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.0241433Z Traceback (most recent call last): 2025-12-04T13:38:32.0241599Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0241644Z getattr(self, test_name)() 2025-12-04T13:38:32.0241806Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0241842Z fn() 2025-12-04T13:38:32.0242013Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0242055Z method(*args, **kwargs) 2025-12-04T13:38:32.0242208Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0242249Z method(*args, **kwargs) 2025-12-04T13:38:32.0242400Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0242438Z with policy(): 2025-12-04T13:38:32.0242604Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0242647Z raise RuntimeError(msg) 2025-12-04T13:38:32.0242995Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 3. CUDA driver allocated memory was 2250244096 and is now 3787456512. 
2025-12-04T13:38:32.0243000Z 2025-12-04T13:38:32.0243076Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0243306Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_no_shard_cuda 2025-12-04T13:38:32.0243308Z 2025-12-04T13:38:32.0243395Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0243459Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.0243538Z ====================== 1 failed, 32 deselected in 39.12s ======================= 2025-12-04T13:38:32.0243576Z Got exit code 1 2025-12-04T13:38:32.0243758Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_no_shard_cuda 2025-12-04T13:38:32.0243888Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:32.0244077Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-7d6773e1cbecc3a5.xml 2025-12-04T13:38:32.0244137Z ============================= test session starts ============================== 2025-12-04T13:38:32.0244248Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.0244294Z cachedir: .pytest_cache 2025-12-04T13:38:32.0244452Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.0244501Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.0244541Z configfile: pytest.ini 2025-12-04T13:38:32.0244705Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.0244780Z collecting ... collected 60 items / 6 deselected / 54 selected 2025-12-04T13:38:32.0244844Z stepcurrent: skipping 6 already run items. 2025-12-04T13:38:32.0244887Z Running 27 items in this shard 2025-12-04T13:38:32.0244889Z 2025-12-04T13:38:32.0245206Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda I1204 13:02:55.729000 384507 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 384576 2025-12-04T13:38:32.0245363Z I1204 13:02:55.730000 384507 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 384577 2025-12-04T13:38:32.0245521Z I1204 13:02:55.730000 384507 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 384578 2025-12-04T13:38:32.0245673Z I1204 13:02:55.731000 384507 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 384579 2025-12-04T13:38:32.0246267Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
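The pytest session header above reports Hypothesis running under a 'pytorch_ci' profile (no example database, 50 examples, derandomized, the too_slow health check suppressed). Roughly how such a profile is registered and activated; where PyTorch's test suite actually does this is not shown in this log:

    from hypothesis import HealthCheck, settings

    # Mirrors the profile values printed in the pytest session header.
    settings.register_profile(
        "pytorch_ci",
        database=None,
        max_examples=50,
        derandomize=True,
        suppress_health_check=[HealthCheck.too_slow],
    )
    settings.load_profile("pytorch_ci")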
2025-12-04T13:38:32.0246307Z _warn_cpu_init() 2025-12-04T13:38:32.0246881Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0246931Z _warn_cpu_init() 2025-12-04T13:38:32.0247501Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0247538Z _warn_cpu_init() 2025-12-04T13:38:32.0248119Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0248169Z _warn_cpu_init() 2025-12-04T13:38:32.0248461Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
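The barrier() UserWarning above means the process group is not bound to a device, so collectives have to infer one from the current context. As the message says, passing `device_id` to `init_process_group` silences it. A minimal sketch, assuming a torchrun-style launch where MASTER_ADDR/MASTER_PORT/RANK/WORLD_SIZE and LOCAL_RANK are already set in the environment:

    import os
    import torch
    import torch.distributed as dist

    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Binding the group to a specific device lets barrier() and other
    # collectives use it directly instead of guessing from the current context.
    dist.init_process_group(
        backend="nccl",  # RCCL on ROCm also registers as "nccl"
        device_id=torch.device("cuda", local_rank),
    )
    dist.barrier()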
2025-12-04T13:38:32.0248503Z return func(*args, **kwargs) 2025-12-04T13:38:32.0248646Z [rank0]:E1204 13:03:33.298000 384576 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0248810Z [rank0]:E1204 13:03:33.298000 384576 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0249101Z [rank0]:E1204 13:03:33.298000 384576 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0249271Z [rank0]:E1204 13:03:33.298000 384576 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0249555Z [rank0]:E1204 13:03:33.298000 384576 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0249720Z [rank0]:E1204 13:03:33.298000 384576 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0249999Z [rank0]:E1204 13:03:33.298000 384576 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0250152Z [rank0]:E1204 13:03:33.298000 384576 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0250445Z [rank0]:E1204 13:03:33.298000 384576 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0250592Z [rank0]:E1204 13:03:33.298000 384576 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0250871Z [rank0]:E1204 13:03:33.298000 384576 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0251019Z [rank0]:E1204 13:03:33.298000 384576 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0251301Z [rank0]:E1204 13:03:33.298000 384576 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0251450Z [rank0]:E1204 13:03:33.298000 384576 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0251938Z [rank0]:E1204 13:03:33.298000 384576 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 
2025-12-04T13:38:32.0252056Z [rank0]:E1204 13:03:33.298000 384576 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0252265Z [rank0]:E1204 13:03:33.298000 384576 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0252631Z [rank0]:E1204 13:03:33.298000 384576 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0252743Z [rank0]:E1204 13:03:33.298000 384576 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0252958Z [rank0]:E1204 13:03:33.298000 384576 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0253122Z [rank0]:E1204 13:03:33.298000 384576 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.0253165Z dist init r=0, world=4 2025-12-04T13:38:32.0253301Z [rank3]:E1204 13:03:33.300000 384579 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0253463Z [rank3]:E1204 13:03:33.300000 384579 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0253766Z [rank3]:E1204 13:03:33.300000 384579 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0253919Z [rank3]:E1204 13:03:33.300000 384579 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0254205Z [rank3]:E1204 13:03:33.300000 384579 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0254330Z [rank3]:E1204 13:03:33.300000 384579 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0254621Z [rank3]:E1204 13:03:33.300000 384579 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0254768Z [rank3]:E1204 13:03:33.300000 384579 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0255046Z [rank3]:E1204 13:03:33.300000 384579 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0255193Z [rank3]:E1204 13:03:33.300000 384579 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0255483Z [rank3]:E1204 13:03:33.300000 384579 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0255622Z [rank3]:E1204 13:03:33.300000 384579 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.0255900Z [rank3]:E1204 13:03:33.300000 384579 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0256049Z [rank3]:E1204 13:03:33.300000 384579 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0256526Z [rank3]:E1204 13:03:33.300000 384579 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 2025-12-04T13:38:32.0256654Z [rank3]:E1204 13:03:33.300000 384579 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0256850Z [rank3]:E1204 13:03:33.300000 384579 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0257212Z [rank3]:E1204 13:03:33.300000 384579 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0257324Z [rank3]:E1204 13:03:33.300000 384579 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0257536Z [rank3]:E1204 13:03:33.300000 384579 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0257703Z [rank3]:E1204 13:03:33.300000 384579 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.0257857Z [rank2]:E1204 13:03:33.300000 384578 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0258020Z [rank2]:E1204 13:03:33.300000 384578 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0258308Z [rank2]:E1204 13:03:33.300000 384578 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0258461Z [rank2]:E1204 13:03:33.300000 384578 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0258759Z [rank2]:E1204 13:03:33.300000 384578 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0258882Z [rank2]:E1204 13:03:33.300000 384578 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0259161Z [rank2]:E1204 13:03:33.300000 384578 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0259311Z [rank2]:E1204 13:03:33.300000 384578 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0259618Z 
[rank2]:E1204 13:03:33.300000 384578 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0259779Z [rank2]:E1204 13:03:33.300000 384578 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0260056Z [rank2]:E1204 13:03:33.300000 384578 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0260192Z [rank2]:E1204 13:03:33.300000 384578 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0260468Z [rank2]:E1204 13:03:33.300000 384578 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0260620Z [rank2]:E1204 13:03:33.300000 384578 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0261112Z [rank2]:E1204 13:03:33.300000 384578 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 2025-12-04T13:38:32.0261228Z [rank2]:E1204 13:03:33.300000 384578 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0261424Z [rank2]:E1204 13:03:33.300000 384578 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0261786Z [rank2]:E1204 13:03:33.300000 384578 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0261902Z [rank2]:E1204 13:03:33.300000 384578 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0262127Z [rank2]:E1204 13:03:33.300000 384578 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0262291Z [rank2]:E1204 13:03:33.300000 384578 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.0262329Z dist init r=3, world=4 2025-12-04T13:38:32.0262370Z dist init r=2, world=4 2025-12-04T13:38:32.0262506Z [rank1]:E1204 13:03:33.309000 384577 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0262668Z [rank1]:E1204 13:03:33.309000 384577 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0262958Z [rank1]:E1204 13:03:33.309000 384577 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0263128Z [rank1]:E1204 13:03:33.309000 384577 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0263414Z [rank1]:E1204 13:03:33.309000 384577 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0263535Z [rank1]:E1204 13:03:33.309000 384577 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0263812Z [rank1]:E1204 13:03:33.309000 384577 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0263972Z [rank1]:E1204 13:03:33.309000 384577 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0264250Z [rank1]:E1204 13:03:33.309000 384577 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0264396Z [rank1]:E1204 13:03:33.309000 384577 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0264672Z [rank1]:E1204 13:03:33.309000 384577 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0264811Z [rank1]:E1204 13:03:33.309000 384577 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0265098Z [rank1]:E1204 13:03:33.309000 384577 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0265248Z [rank1]:E1204 13:03:33.309000 384577 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0265726Z [rank1]:E1204 13:03:33.309000 384577 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 
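Each failing run in this log also ends with a ProcessGroupNCCL warning (seen above after the first failure and again below) that destroy_process_group() was not called before program exit. In user code the usual pattern is an explicit teardown; a sketch, assuming a torchrun-style launch:

    import torch.distributed as dist

    def main():
        dist.init_process_group(backend="nccl")
        try:
            pass  # training / test body goes here
        finally:
            # Explicit teardown avoids the "destroy_process_group() was not
            # called before program exit" warning and frees communicator state.
            dist.destroy_process_group()

    if __name__ == "__main__":
        main()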
2025-12-04T13:38:32.0265841Z [rank1]:E1204 13:03:33.309000 384577 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0266040Z [rank1]:E1204 13:03:33.309000 384577 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0266413Z [rank1]:E1204 13:03:33.309000 384577 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0266528Z [rank1]:E1204 13:03:33.309000 384577 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0266737Z [rank1]:E1204 13:03:33.309000 384577 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0266902Z [rank1]:E1204 13:03:33.309000 384577 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.0266942Z dist init r=1, world=4 2025-12-04T13:38:32.0267287Z [rank0]:[W1204 13:03:33.560839773 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.0267327Z FAILED [39.5588s] [ 3%] 2025-12-04T13:38:32.0267329Z 2025-12-04T13:38:32.0267387Z =================================== FAILURES =================================== 2025-12-04T13:38:32.0267491Z _ TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda _ 2025-12-04T13:38:32.0267540Z Traceback (most recent call last): 2025-12-04T13:38:32.0267704Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.0267747Z self._join_processes(fn) 2025-12-04T13:38:32.0267932Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.0267985Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.0268165Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.0268209Z raise RuntimeError(error) 2025-12-04T13:38:32.0268292Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.0268337Z Traceback (most recent call last): 2025-12-04T13:38:32.0268501Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0268543Z getattr(self, test_name)() 2025-12-04T13:38:32.0268703Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0268737Z fn() 2025-12-04T13:38:32.0268893Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0268945Z method(*args, **kwargs) 2025-12-04T13:38:32.0269098Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0269139Z method(*args, **kwargs) 2025-12-04T13:38:32.0269293Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0269331Z with policy(): 2025-12-04T13:38:32.0269483Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0269525Z raise RuntimeError(msg) 2025-12-04T13:38:32.0269906Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 2025-12-04T13:38:32.0269910Z 2025-12-04T13:38:32.0269986Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0270222Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0270225Z 2025-12-04T13:38:32.0270326Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0270329Z 2025-12-04T13:38:32.0270330Z 2025-12-04T13:38:32.0270406Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.0270496Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.0270731Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-7d6773e1cbecc3a5.xml - 2025-12-04T13:38:32.0270796Z =========================== short test summary info ============================ 2025-12-04T13:38:32.0271052Z FAILED [39.5588s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.0271112Z Traceback (most recent call last): 2025-12-04T13:38:32.0271279Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0271321Z getattr(self, test_name)() 2025-12-04T13:38:32.0271483Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0271517Z fn() 2025-12-04T13:38:32.0271671Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0271723Z method(*args, **kwargs) 2025-12-04T13:38:32.0271878Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0271917Z method(*args, **kwargs) 2025-12-04T13:38:32.0272072Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0272108Z with policy(): 2025-12-04T13:38:32.0272262Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0272301Z raise RuntimeError(msg) 2025-12-04T13:38:32.0272658Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. 
CUDA driver allocated memory was 2453667840 and is now 3965714432. 2025-12-04T13:38:32.0272661Z 2025-12-04T13:38:32.0272734Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0272993Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0272995Z 2025-12-04T13:38:32.0273083Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0273147Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.0273211Z ======================= 1 failed, 6 deselected in 39.72s ======================= 2025-12-04T13:38:32.0273247Z Got exit code 1 2025-12-04T13:38:32.0273289Z Retrying single test... 2025-12-04T13:38:32.0273479Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-159aa8b84e6f8c02.xml 2025-12-04T13:38:32.0273538Z ============================= test session starts ============================== 2025-12-04T13:38:32.0273651Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.0273693Z cachedir: .pytest_cache 2025-12-04T13:38:32.0273852Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.0273900Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.0273940Z configfile: pytest.ini 2025-12-04T13:38:32.0274115Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.0274189Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.0274419Z stepcurrent: skipping 6 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0274462Z Running 1 items in this shard 2025-12-04T13:38:32.0274465Z 2025-12-04T13:38:32.0274780Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda I1204 13:03:37.835000 384909 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 384978 2025-12-04T13:38:32.0274946Z I1204 13:03:37.836000 384909 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 384979 2025-12-04T13:38:32.0275102Z I1204 13:03:37.836000 384909 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 384980 2025-12-04T13:38:32.0275254Z I1204 13:03:37.837000 384909 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 384981 2025-12-04T13:38:32.0275835Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.0275888Z _warn_cpu_init() 2025-12-04T13:38:32.0276453Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0276492Z _warn_cpu_init() 2025-12-04T13:38:32.0277058Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0277106Z _warn_cpu_init() 2025-12-04T13:38:32.0277668Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0277704Z _warn_cpu_init() 2025-12-04T13:38:32.0278000Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T13:38:32.0278046Z return func(*args, **kwargs) 2025-12-04T13:38:32.0278187Z [rank3]:E1204 13:04:15.502000 384981 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0278360Z [rank3]:E1204 13:04:15.502000 384981 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0278651Z [rank3]:E1204 13:04:15.502000 384981 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0278807Z [rank3]:E1204 13:04:15.502000 384981 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0279093Z [rank3]:E1204 13:04:15.502000 384981 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0279221Z [rank3]:E1204 13:04:15.502000 384981 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0279511Z [rank3]:E1204 13:04:15.502000 384981 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0279689Z [rank3]:E1204 13:04:15.502000 384981 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0280037Z [rank3]:E1204 13:04:15.502000 384981 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0280201Z [rank3]:E1204 13:04:15.502000 384981 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0280483Z [rank3]:E1204 13:04:15.502000 384981 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0280619Z [rank3]:E1204 13:04:15.502000 384981 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0280900Z [rank3]:E1204 13:04:15.502000 384981 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0281047Z [rank3]:E1204 13:04:15.502000 384981 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0281533Z [rank3]:E1204 13:04:15.502000 384981 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 
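Annotation: the numbers in the RuntimeError above come from comparing per-device memory counters taken before and after the test body. A rough sketch of that idea using public torch.cuda APIs (an illustration only; `run_workload` is a hypothetical stand-in, and the real leak check in common_utils.py is more careful than this):

import torch

dev = torch.device("cuda", 0)

def run_workload():
    # hypothetical stand-in for the test body being checked
    x = torch.randn(1024, 1024, device=dev)
    return (x @ x).sum().item()

torch.cuda.synchronize(dev)
alloc_before = torch.cuda.memory_allocated(dev)   # caching-allocator bytes in use
free_before, total = torch.cuda.mem_get_info(dev)
driver_before = total - free_before               # driver-level bytes in use

run_workload()

torch.cuda.synchronize(dev)
torch.cuda.empty_cache()
alloc_after = torch.cuda.memory_allocated(dev)
free_after, _ = torch.cuda.mem_get_info(dev)
driver_after = total - free_after
if alloc_after > alloc_before or driver_after > driver_before:
    raise RuntimeError(
        f"possible CUDA memory leak: allocator {alloc_before} -> {alloc_after}, "
        f"driver {driver_before} -> {driver_after}"
    )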
2025-12-04T13:38:32.0281665Z [rank3]:E1204 13:04:15.502000 384981 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0281861Z [rank3]:E1204 13:04:15.502000 384981 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0282225Z [rank3]:E1204 13:04:15.502000 384981 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0282339Z [rank3]:E1204 13:04:15.502000 384981 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0282554Z [rank3]:E1204 13:04:15.502000 384981 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0282720Z [rank3]:E1204 13:04:15.502000 384981 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.0282772Z dist init r=3, world=4 2025-12-04T13:38:32.0282909Z [rank1]:E1204 13:04:15.556000 384979 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0283070Z [rank1]:E1204 13:04:15.556000 384979 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0283359Z [rank1]:E1204 13:04:15.556000 384979 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0283514Z [rank1]:E1204 13:04:15.556000 384979 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0283813Z [rank1]:E1204 13:04:15.556000 384979 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0283938Z [rank1]:E1204 13:04:15.556000 384979 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0284215Z [rank1]:E1204 13:04:15.556000 384979 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0284361Z [rank1]:E1204 13:04:15.556000 384979 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0284650Z [rank1]:E1204 13:04:15.556000 384979 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0284799Z [rank1]:E1204 13:04:15.556000 384979 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0285075Z [rank1]:E1204 13:04:15.556000 384979 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0285212Z [rank1]:E1204 13:04:15.556000 384979 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.0285489Z [rank1]:E1204 13:04:15.556000 384979 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0285661Z [rank1]:E1204 13:04:15.556000 384979 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0286142Z [rank1]:E1204 13:04:15.556000 384979 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 2025-12-04T13:38:32.0286257Z [rank1]:E1204 13:04:15.556000 384979 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0286453Z [rank1]:E1204 13:04:15.556000 384979 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0286815Z [rank1]:E1204 13:04:15.556000 384979 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0286931Z [rank1]:E1204 13:04:15.556000 384979 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0287152Z [rank1]:E1204 13:04:15.556000 384979 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0287318Z [rank1]:E1204 13:04:15.556000 384979 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.0287356Z dist init r=1, world=4 2025-12-04T13:38:32.0287494Z [rank0]:E1204 13:04:15.567000 384978 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0287652Z [rank0]:E1204 13:04:15.567000 384978 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0287951Z [rank0]:E1204 13:04:15.567000 384978 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0288108Z [rank0]:E1204 13:04:15.567000 384978 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0288392Z [rank0]:E1204 13:04:15.567000 384978 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0288516Z [rank0]:E1204 13:04:15.567000 384978 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0288810Z [rank0]:E1204 13:04:15.567000 384978 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0288959Z [rank0]:E1204 13:04:15.567000 384978 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:38:32.0289238Z [rank0]:E1204 13:04:15.567000 384978 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0289388Z [rank0]:E1204 13:04:15.567000 384978 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0289708Z [rank0]:E1204 13:04:15.567000 384978 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0289857Z [rank0]:E1204 13:04:15.567000 384978 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0290136Z [rank0]:E1204 13:04:15.567000 384978 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0290284Z [rank0]:E1204 13:04:15.567000 384978 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0290763Z [rank0]:E1204 13:04:15.567000 384978 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 2025-12-04T13:38:32.0290878Z [rank0]:E1204 13:04:15.567000 384978 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0291075Z [rank0]:E1204 13:04:15.567000 384978 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0291449Z [rank0]:E1204 13:04:15.567000 384978 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0291561Z [rank0]:E1204 13:04:15.567000 384978 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0291775Z [rank0]:E1204 13:04:15.567000 384978 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0291939Z [rank0]:E1204 13:04:15.567000 384978 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.0291980Z dist init r=0, world=4 2025-12-04T13:38:32.0292115Z [rank2]:E1204 13:04:15.570000 384980 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0292290Z [rank2]:E1204 13:04:15.570000 384980 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0292578Z [rank2]:E1204 13:04:15.570000 384980 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0292730Z [rank2]:E1204 13:04:15.570000 384980 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0293016Z [rank2]:E1204 13:04:15.570000 384980 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0293152Z [rank2]:E1204 13:04:15.570000 384980 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0293432Z [rank2]:E1204 13:04:15.570000 384980 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0293578Z [rank2]:E1204 13:04:15.570000 384980 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0293854Z [rank2]:E1204 13:04:15.570000 384980 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0294002Z [rank2]:E1204 13:04:15.570000 384980 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0294293Z [rank2]:E1204 13:04:15.570000 384980 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0294430Z [rank2]:E1204 13:04:15.570000 384980 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0294706Z [rank2]:E1204 13:04:15.570000 384980 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0294854Z [rank2]:E1204 13:04:15.570000 384980 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0295331Z [rank2]:E1204 13:04:15.570000 384980 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 
2025-12-04T13:38:32.0295449Z [rank2]:E1204 13:04:15.570000 384980 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0295655Z [rank2]:E1204 13:04:15.570000 384980 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0296014Z [rank2]:E1204 13:04:15.570000 384980 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0296127Z [rank2]:E1204 13:04:15.570000 384980 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0296337Z [rank2]:E1204 13:04:15.570000 384980 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0296512Z [rank2]:E1204 13:04:15.570000 384980 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.0296553Z dist init r=2, world=4 2025-12-04T13:38:32.0296897Z [rank0]:[W1204 13:04:15.841703378 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.0296937Z FAILED [39.5600s] [100%] 2025-12-04T13:38:32.0296939Z 2025-12-04T13:38:32.0297000Z =================================== FAILURES =================================== 2025-12-04T13:38:32.0297116Z _ TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda _ 2025-12-04T13:38:32.0297164Z Traceback (most recent call last): 2025-12-04T13:38:32.0297330Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.0297375Z self._join_processes(fn) 2025-12-04T13:38:32.0297554Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.0297608Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.0297790Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.0297834Z raise RuntimeError(error) 2025-12-04T13:38:32.0297918Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.0297964Z Traceback (most recent call last): 2025-12-04T13:38:32.0298129Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0298183Z getattr(self, test_name)() 2025-12-04T13:38:32.0298345Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0298380Z fn() 2025-12-04T13:38:32.0298538Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0298580Z method(*args, **kwargs) 2025-12-04T13:38:32.0298735Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0298776Z method(*args, **kwargs) 2025-12-04T13:38:32.0298931Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0298969Z with policy(): 2025-12-04T13:38:32.0299127Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0299169Z raise RuntimeError(msg) 2025-12-04T13:38:32.0299525Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 2025-12-04T13:38:32.0299538Z 2025-12-04T13:38:32.0299817Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0300052Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0300055Z 2025-12-04T13:38:32.0300145Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0300148Z 2025-12-04T13:38:32.0300150Z 2025-12-04T13:38:32.0300226Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.0300316Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.0300564Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-159aa8b84e6f8c02.xml - 2025-12-04T13:38:32.0300631Z =========================== short test summary info ============================ 2025-12-04T13:38:32.0300884Z FAILED [39.5600s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.0300934Z Traceback (most recent call last): 2025-12-04T13:38:32.0301101Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0301158Z getattr(self, test_name)() 2025-12-04T13:38:32.0301323Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0301358Z fn() 2025-12-04T13:38:32.0301515Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0301557Z method(*args, **kwargs) 2025-12-04T13:38:32.0301713Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0301756Z method(*args, **kwargs) 2025-12-04T13:38:32.0301910Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0301948Z with policy(): 2025-12-04T13:38:32.0302104Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0302146Z raise RuntimeError(msg) 2025-12-04T13:38:32.0302519Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. 
CUDA driver allocated memory was 2250244096 and is now 3762290688. 2025-12-04T13:38:32.0302522Z 2025-12-04T13:38:32.0302596Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0302836Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0302839Z 2025-12-04T13:38:32.0302928Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0302991Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.0303057Z ====================== 1 failed, 32 deselected in 39.72s ======================= 2025-12-04T13:38:32.0303097Z Got exit code 1 2025-12-04T13:38:32.0303142Z Retrying single test... 2025-12-04T13:38:32.0303334Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-2cc526eae4caa98d.xml 2025-12-04T13:38:32.0303397Z ============================= test session starts ============================== 2025-12-04T13:38:32.0303525Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.0303570Z cachedir: .pytest_cache 2025-12-04T13:38:32.0303795Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.0303845Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.0303886Z configfile: pytest.ini 2025-12-04T13:38:32.0304052Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.0304127Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.0304360Z stepcurrent: skipping 6 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0304404Z Running 1 items in this shard 2025-12-04T13:38:32.0304426Z 2025-12-04T13:38:32.0304746Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda I1204 13:04:20.037000 385311 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 385380 2025-12-04T13:38:32.0304908Z I1204 13:04:20.037000 385311 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 385381 2025-12-04T13:38:32.0305059Z I1204 13:04:20.038000 385311 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 385382 2025-12-04T13:38:32.0305227Z I1204 13:04:20.038000 385311 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 385383 2025-12-04T13:38:32.0305811Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.0305853Z _warn_cpu_init() 2025-12-04T13:38:32.0306422Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0306479Z _warn_cpu_init() 2025-12-04T13:38:32.0307052Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0307089Z _warn_cpu_init() 2025-12-04T13:38:32.0307652Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0307691Z _warn_cpu_init() 2025-12-04T13:38:32.0307998Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T13:38:32.0308045Z return func(*args, **kwargs) 2025-12-04T13:38:32.0308188Z [rank0]:E1204 13:04:57.482000 385380 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0308353Z [rank0]:E1204 13:04:57.482000 385380 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0308644Z [rank0]:E1204 13:04:57.482000 385380 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0308804Z [rank0]:E1204 13:04:57.482000 385380 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0309101Z [rank0]:E1204 13:04:57.482000 385380 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0309230Z [rank0]:E1204 13:04:57.482000 385380 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0309507Z [rank0]:E1204 13:04:57.482000 385380 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0309698Z [rank0]:E1204 13:04:57.482000 385380 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0309995Z [rank0]:E1204 13:04:57.482000 385380 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0310142Z [rank0]:E1204 13:04:57.482000 385380 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0310422Z [rank0]:E1204 13:04:57.482000 385380 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0310558Z [rank0]:E1204 13:04:57.482000 385380 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0310838Z [rank0]:E1204 13:04:57.482000 385380 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0310998Z [rank0]:E1204 13:04:57.482000 385380 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0311487Z [rank0]:E1204 13:04:57.482000 385380 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 
2025-12-04T13:38:32.0311609Z [rank0]:E1204 13:04:57.482000 385380 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0311803Z [rank0]:E1204 13:04:57.482000 385380 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0312172Z [rank0]:E1204 13:04:57.482000 385380 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0312286Z [rank0]:E1204 13:04:57.482000 385380 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0312515Z [rank0]:E1204 13:04:57.482000 385380 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0312682Z [rank0]:E1204 13:04:57.482000 385380 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.0312724Z dist init r=0, world=4 2025-12-04T13:38:32.0312864Z [rank2]:E1204 13:04:57.484000 385382 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0313024Z [rank2]:E1204 13:04:57.484000 385382 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0313327Z [rank2]:E1204 13:04:57.484000 385382 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0313481Z [rank2]:E1204 13:04:57.484000 385382 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0313768Z [rank2]:E1204 13:04:57.484000 385382 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0313892Z [rank2]:E1204 13:04:57.484000 385382 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0314182Z [rank2]:E1204 13:04:57.484000 385382 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0314331Z [rank2]:E1204 13:04:57.484000 385382 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0314609Z [rank2]:E1204 13:04:57.484000 385382 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0314759Z [rank2]:E1204 13:04:57.484000 385382 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0315034Z [rank2]:E1204 13:04:57.484000 385382 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0315183Z [rank2]:E1204 13:04:57.484000 385382 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.0315463Z [rank2]:E1204 13:04:57.484000 385382 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0315615Z [rank2]:E1204 13:04:57.484000 385382 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0316097Z [rank2]:E1204 13:04:57.484000 385382 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 2025-12-04T13:38:32.0316217Z [rank2]:E1204 13:04:57.484000 385382 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0316416Z [rank2]:E1204 13:04:57.484000 385382 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0316790Z [rank2]:E1204 13:04:57.484000 385382 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0316905Z [rank2]:E1204 13:04:57.484000 385382 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0317117Z [rank2]:E1204 13:04:57.484000 385382 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0317286Z [rank2]:E1204 13:04:57.484000 385382 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.0317326Z dist init r=2, world=4 2025-12-04T13:38:32.0317467Z [rank3]:E1204 13:04:57.522000 385383 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0317641Z [rank3]:E1204 13:04:57.522000 385383 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0317927Z [rank3]:E1204 13:04:57.522000 385383 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0318082Z [rank3]:E1204 13:04:57.522000 385383 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0318366Z [rank3]:E1204 13:04:57.522000 385383 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0318504Z [rank3]:E1204 13:04:57.522000 385383 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0318784Z [rank3]:E1204 13:04:57.522000 385383 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0318934Z [rank3]:E1204 13:04:57.522000 385383 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:38:32.0319209Z [rank3]:E1204 13:04:57.522000 385383 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0319358Z [rank3]:E1204 13:04:57.522000 385383 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0319686Z [rank3]:E1204 13:04:57.522000 385383 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0319822Z [rank3]:E1204 13:04:57.522000 385383 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0320101Z [rank3]:E1204 13:04:57.522000 385383 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0320250Z [rank3]:E1204 13:04:57.522000 385383 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0320732Z [rank3]:E1204 13:04:57.522000 385383 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 2025-12-04T13:38:32.0320851Z [rank3]:E1204 13:04:57.522000 385383 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0321065Z [rank3]:E1204 13:04:57.522000 385383 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0321433Z [rank3]:E1204 13:04:57.522000 385383 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0321546Z [rank3]:E1204 13:04:57.522000 385383 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0321760Z [rank3]:E1204 13:04:57.522000 385383 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0321938Z [rank3]:E1204 13:04:57.522000 385383 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.0321984Z dist init r=3, world=4 2025-12-04T13:38:32.0322121Z [rank1]:E1204 13:04:57.536000 385381 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0322283Z [rank1]:E1204 13:04:57.536000 385381 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0322574Z [rank1]:E1204 13:04:57.536000 385381 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0322741Z [rank1]:E1204 13:04:57.536000 385381 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0323031Z [rank1]:E1204 13:04:57.536000 385381 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0323155Z [rank1]:E1204 13:04:57.536000 385381 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0323437Z [rank1]:E1204 13:04:57.536000 385381 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0323584Z [rank1]:E1204 13:04:57.536000 385381 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0323879Z [rank1]:E1204 13:04:57.536000 385381 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0324031Z [rank1]:E1204 13:04:57.536000 385381 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0324308Z [rank1]:E1204 13:04:57.536000 385381 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0324446Z [rank1]:E1204 13:04:57.536000 385381 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0324727Z [rank1]:E1204 13:04:57.536000 385381 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0324881Z [rank1]:E1204 13:04:57.536000 385381 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0325371Z [rank1]:E1204 13:04:57.536000 385381 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 
2025-12-04T13:38:32.0325489Z [rank1]:E1204 13:04:57.536000 385381 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0325687Z [rank1]:E1204 13:04:57.536000 385381 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0326049Z [rank1]:E1204 13:04:57.536000 385381 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0326175Z [rank1]:E1204 13:04:57.536000 385381 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0326386Z [rank1]:E1204 13:04:57.536000 385381 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0326553Z [rank1]:E1204 13:04:57.536000 385381 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.0326592Z dist init r=1, world=4 2025-12-04T13:38:32.0326930Z [rank0]:[W1204 13:04:57.668643805 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.0326984Z FAILED [39.2612s] [100%] 2025-12-04T13:38:32.0326989Z 2025-12-04T13:38:32.0327046Z =================================== FAILURES =================================== 2025-12-04T13:38:32.0327154Z _ TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda _ 2025-12-04T13:38:32.0327202Z Traceback (most recent call last): 2025-12-04T13:38:32.0327369Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.0327412Z self._join_processes(fn) 2025-12-04T13:38:32.0327589Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.0327645Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.0327830Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.0327886Z raise RuntimeError(error) 2025-12-04T13:38:32.0327972Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.0328017Z Traceback (most recent call last): 2025-12-04T13:38:32.0328184Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0328227Z getattr(self, test_name)() 2025-12-04T13:38:32.0328388Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0328423Z fn() 2025-12-04T13:38:32.0328577Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0328619Z method(*args, **kwargs) 2025-12-04T13:38:32.0328775Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0328817Z method(*args, **kwargs) 2025-12-04T13:38:32.0328973Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0329010Z with policy(): 2025-12-04T13:38:32.0329177Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0329219Z raise RuntimeError(msg) 2025-12-04T13:38:32.0329619Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 2025-12-04T13:38:32.0329621Z 2025-12-04T13:38:32.0329701Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0329938Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0329942Z 2025-12-04T13:38:32.0330033Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0330035Z 2025-12-04T13:38:32.0330051Z 2025-12-04T13:38:32.0330127Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.0330218Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.0330451Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-2cc526eae4caa98d.xml - 2025-12-04T13:38:32.0330515Z =========================== short test summary info ============================ 2025-12-04T13:38:32.0330764Z FAILED [39.2612s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.0330826Z Traceback (most recent call last): 2025-12-04T13:38:32.0330994Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0331037Z getattr(self, test_name)() 2025-12-04T13:38:32.0331202Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0331237Z fn() 2025-12-04T13:38:32.0331393Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0331434Z method(*args, **kwargs) 2025-12-04T13:38:32.0331588Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0331630Z method(*args, **kwargs) 2025-12-04T13:38:32.0331800Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0331837Z with policy(): 2025-12-04T13:38:32.0331995Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0332037Z raise RuntimeError(msg) 2025-12-04T13:38:32.0332399Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. 
CUDA driver allocated memory was 2453667840 and is now 3965714432. 2025-12-04T13:38:32.0332401Z 2025-12-04T13:38:32.0332476Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0332714Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0332718Z 2025-12-04T13:38:32.0332808Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0332871Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.0332939Z ====================== 1 failed, 32 deselected in 39.43s ======================= 2025-12-04T13:38:32.0332976Z Got exit code 1 2025-12-04T13:38:32.0333178Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0333307Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:32.0333497Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-75e1e2e1e2b3ee4f.xml 2025-12-04T13:38:32.0333556Z ============================= test session starts ============================== 2025-12-04T13:38:32.0333674Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.0333716Z cachedir: .pytest_cache 2025-12-04T13:38:32.0333878Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.0333934Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.0333980Z configfile: pytest.ini 2025-12-04T13:38:32.0334145Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.0334222Z collecting ... collected 60 items / 7 deselected / 53 selected 2025-12-04T13:38:32.0334275Z stepcurrent: skipping 7 already run items. 2025-12-04T13:38:32.0334323Z Running 26 items in this shard 2025-12-04T13:38:32.0334325Z 2025-12-04T13:38:32.0334641Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_no_shard_cuda I1204 13:05:01.959000 385713 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 385782 2025-12-04T13:38:32.0334811Z I1204 13:05:01.960000 385713 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 385783 2025-12-04T13:38:32.0334968Z I1204 13:05:01.960000 385713 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 385784 2025-12-04T13:38:32.0335120Z I1204 13:05:01.960000 385713 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 385785 2025-12-04T13:38:32.0335419Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
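Annotation: the FutureWarning above points at DistributedDataParallel as the replacement for FSDP's deprecated NO_SHARD strategy. A minimal sketch of that swap (assumes an initialized process group and one visible GPU; the Linear module is a hypothetical stand-in):

import torch
from torch.nn.parallel import DistributedDataParallel as DDP

model = torch.nn.Linear(8, 8).cuda()                                # hypothetical module on the current GPU
ddp_model = DDP(model, device_ids=[torch.cuda.current_device()])    # DDP replicates instead of sharding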
2025-12-04T13:38:32.0335471Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0336048Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0336106Z _warn_cpu_init() 2025-12-04T13:38:32.0336402Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0336486Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0336772Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0336829Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0337112Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0337164Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0337749Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0337792Z _warn_cpu_init() 2025-12-04T13:38:32.0338379Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0338419Z _warn_cpu_init() 2025-12-04T13:38:32.0338712Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0338761Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0339339Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.0339390Z _warn_cpu_init() 2025-12-04T13:38:32.0339722Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0339804Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0340093Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0340173Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0340474Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0340552Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0341831Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.0341974Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.0342207Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0342250Z return func(*args, **kwargs) 2025-12-04T13:38:32.0343521Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
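[editor note] The AccumulateGrad stream-mismatch UserWarning above names its own off-switch for the case where the cross-stream accumulation is intentional. The one-line sketch below only repeats the call quoted in the warning; whether silencing it is appropriate for this test is a separate judgment.

    import torch

    # Quoted verbatim in the warning above; suppresses the AccumulateGrad
    # stream-mismatch UserWarning when the mismatch is intentional.
    torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)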
2025-12-04T13:38:32.0343652Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.0344910Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.0345045Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.0345285Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0345328Z return func(*args, **kwargs) 2025-12-04T13:38:32.0345558Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0345603Z return func(*args, **kwargs) 2025-12-04T13:38:32.0346867Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.0346992Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.0347226Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0347268Z return func(*args, **kwargs) 2025-12-04T13:38:32.0347495Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 
2025-12-04T13:38:32.0347537Z return func(*args, **kwargs) 2025-12-04T13:38:32.0347770Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.0347813Z return func(*args, **kwargs) 2025-12-04T13:38:32.0348036Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.0348077Z return func(*args, **kwargs) 2025-12-04T13:38:32.0348299Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.0348350Z return func(*args, **kwargs) 2025-12-04T13:38:32.0348646Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.0348688Z return func(*args, **kwargs) 2025-12-04T13:38:32.0348836Z [rank1]:E1204 13:05:09.492000 385783 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0349003Z [rank1]:E1204 13:05:09.492000 385783 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0349295Z [rank1]:E1204 13:05:09.492000 385783 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0349455Z [rank1]:E1204 13:05:09.492000 385783 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0349794Z [rank1]:E1204 13:05:09.492000 385783 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0349925Z [rank1]:E1204 13:05:09.492000 385783 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0350206Z [rank1]:E1204 13:05:09.492000 385783 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0350360Z [rank1]:E1204 13:05:09.492000 385783 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0350638Z [rank1]:E1204 13:05:09.492000 385783 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0350793Z [rank1]:E1204 13:05:09.492000 385783 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0351090Z [rank1]:E1204 13:05:09.492000 385783 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0351229Z [rank1]:E1204 13:05:09.492000 385783 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0351512Z 
[rank1]:E1204 13:05:09.492000 385783 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0351661Z [rank1]:E1204 13:05:09.492000 385783 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0352172Z [rank1]:E1204 13:05:09.492000 385783 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 160256 on device 1. CUDA driver allocated memory was 2317352960 and is now 3875536896. 2025-12-04T13:38:32.0352292Z [rank1]:E1204 13:05:09.492000 385783 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0352488Z [rank1]:E1204 13:05:09.492000 385783 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0352855Z [rank1]:E1204 13:05:09.492000 385783 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda 2025-12-04T13:38:32.0352988Z [rank1]:E1204 13:05:09.492000 385783 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0353206Z [rank1]:E1204 13:05:09.492000 385783 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0353372Z [rank1]:E1204 13:05:09.492000 385783 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.0353415Z dist init r=1, world=4 2025-12-04T13:38:32.0353552Z [rank0]:E1204 13:05:09.539000 385782 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0353715Z [rank0]:E1204 13:05:09.539000 385782 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0354006Z [rank0]:E1204 13:05:09.539000 385782 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0354178Z [rank0]:E1204 13:05:09.539000 385782 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0354469Z [rank0]:E1204 13:05:09.539000 385782 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0354592Z [rank0]:E1204 13:05:09.539000 385782 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0354874Z [rank0]:E1204 13:05:09.539000 385782 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0355023Z [rank0]:E1204 13:05:09.539000 385782 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T13:38:32.0355301Z [rank0]:E1204 13:05:09.539000 385782 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0355461Z [rank0]:E1204 13:05:09.539000 385782 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0355736Z [rank0]:E1204 13:05:09.539000 385782 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0355875Z [rank0]:E1204 13:05:09.539000 385782 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0356156Z [rank0]:E1204 13:05:09.539000 385782 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0356319Z [rank0]:E1204 13:05:09.539000 385782 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0356807Z [rank0]:E1204 13:05:09.539000 385782 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 162304 on device 0. CUDA driver allocated memory was 2453667840 and is now 4011851776. 2025-12-04T13:38:32.0356924Z [rank0]:E1204 13:05:09.539000 385782 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0370140Z [rank0]:E1204 13:05:09.539000 385782 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0370557Z [rank0]:E1204 13:05:09.539000 385782 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda 2025-12-04T13:38:32.0370686Z [rank0]:E1204 13:05:09.539000 385782 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0370909Z [rank0]:E1204 13:05:09.539000 385782 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0371085Z [rank0]:E1204 13:05:09.539000 385782 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.0371131Z dist init r=0, world=4 2025-12-04T13:38:32.0371280Z [rank2]:E1204 13:05:09.543000 385784 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0371509Z [rank2]:E1204 13:05:09.543000 385784 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0371813Z [rank2]:E1204 13:05:09.543000 385784 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0371973Z [rank2]:E1204 13:05:09.543000 385784 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:38:32.0372267Z [rank2]:E1204 13:05:09.543000 385784 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0372398Z [rank2]:E1204 13:05:09.543000 385784 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0372680Z [rank2]:E1204 13:05:09.543000 385784 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0372850Z [rank2]:E1204 13:05:09.543000 385784 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0373127Z [rank2]:E1204 13:05:09.543000 385784 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0373279Z [rank2]:E1204 13:05:09.543000 385784 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0373558Z [rank2]:E1204 13:05:09.543000 385784 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0373699Z [rank2]:E1204 13:05:09.543000 385784 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0374001Z [rank2]:E1204 13:05:09.543000 385784 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0374150Z [rank2]:E1204 13:05:09.543000 385784 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0374645Z [rank2]:E1204 13:05:09.543000 385784 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 164352 on device 2. CUDA driver allocated memory was 2300575744 and is now 3858759680. 
2025-12-04T13:38:32.0374780Z [rank2]:E1204 13:05:09.543000 385784 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0374983Z [rank2]:E1204 13:05:09.543000 385784 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0375353Z [rank2]:E1204 13:05:09.543000 385784 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda 2025-12-04T13:38:32.0375468Z [rank2]:E1204 13:05:09.543000 385784 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0375684Z [rank2]:E1204 13:05:09.543000 385784 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0375862Z [rank2]:E1204 13:05:09.543000 385784 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.0375907Z dist init r=2, world=4 2025-12-04T13:38:32.0376046Z [rank3]:E1204 13:05:09.546000 385785 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0376209Z [rank3]:E1204 13:05:09.546000 385785 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0376498Z [rank3]:E1204 13:05:09.546000 385785 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0376655Z [rank3]:E1204 13:05:09.546000 385785 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0376951Z [rank3]:E1204 13:05:09.546000 385785 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0377078Z [rank3]:E1204 13:05:09.546000 385785 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0377370Z [rank3]:E1204 13:05:09.546000 385785 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0377520Z [rank3]:E1204 13:05:09.546000 385785 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0377803Z [rank3]:E1204 13:05:09.546000 385785 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0377959Z [rank3]:E1204 13:05:09.546000 385785 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0378254Z [rank3]:E1204 13:05:09.546000 385785 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0378394Z [rank3]:E1204 13:05:09.546000 385785 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.0378672Z [rank3]:E1204 13:05:09.546000 385785 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0378823Z [rank3]:E1204 13:05:09.546000 385785 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0379325Z [rank3]:E1204 13:05:09.546000 385785 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 156160 on device 3. CUDA driver allocated memory was 2250244096 and is now 3808428032. 2025-12-04T13:38:32.0379445Z [rank3]:E1204 13:05:09.546000 385785 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0379691Z [rank3]:E1204 13:05:09.546000 385785 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0380060Z [rank3]:E1204 13:05:09.546000 385785 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda 2025-12-04T13:38:32.0380195Z [rank3]:E1204 13:05:09.546000 385785 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0380407Z [rank3]:E1204 13:05:09.546000 385785 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0380575Z [rank3]:E1204 13:05:09.546000 385785 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.0380616Z dist init r=3, world=4 2025-12-04T13:38:32.0380962Z [rank0]:[W1204 13:05:09.790562524 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.0381003Z FAILED [9.5196s] [ 3%] 2025-12-04T13:38:32.0381008Z 2025-12-04T13:38:32.0381071Z =================================== FAILURES =================================== 2025-12-04T13:38:32.0381178Z _ TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda _ 2025-12-04T13:38:32.0381230Z Traceback (most recent call last): 2025-12-04T13:38:32.0381399Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.0381462Z self._join_processes(fn) 2025-12-04T13:38:32.0381637Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.0381696Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.0381878Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.0381925Z raise RuntimeError(error) 2025-12-04T13:38:32.0382011Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.0382059Z Traceback (most recent call last): 2025-12-04T13:38:32.0382224Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0382268Z getattr(self, test_name)() 2025-12-04T13:38:32.0382444Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0382480Z fn() 2025-12-04T13:38:32.0382637Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0382679Z method(*args, **kwargs) 2025-12-04T13:38:32.0382835Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0382876Z method(*args, **kwargs) 2025-12-04T13:38:32.0383045Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0383084Z with policy(): 2025-12-04T13:38:32.0383242Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0383284Z raise RuntimeError(msg) 2025-12-04T13:38:32.0383649Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 160256 on device 1. CUDA driver allocated memory was 2317352960 and is now 3875536896. 
2025-12-04T13:38:32.0383652Z 2025-12-04T13:38:32.0383729Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0383973Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda 2025-12-04T13:38:32.0383977Z 2025-12-04T13:38:32.0384078Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0384080Z 2025-12-04T13:38:32.0384082Z 2025-12-04T13:38:32.0384163Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.0384254Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.0384489Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-75e1e2e1e2b3ee4f.xml - 2025-12-04T13:38:32.0384553Z =========================== short test summary info ============================ 2025-12-04T13:38:32.0384814Z FAILED [9.5196s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_no_shard_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.0384865Z Traceback (most recent call last): 2025-12-04T13:38:32.0385034Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0385080Z getattr(self, test_name)() 2025-12-04T13:38:32.0385242Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0385283Z fn() 2025-12-04T13:38:32.0406043Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0406102Z method(*args, **kwargs) 2025-12-04T13:38:32.0406261Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0406301Z method(*args, **kwargs) 2025-12-04T13:38:32.0406457Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0406496Z with policy(): 2025-12-04T13:38:32.0406650Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0406690Z raise RuntimeError(msg) 2025-12-04T13:38:32.0407076Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 160256 on device 1. CUDA driver allocated memory was 2317352960 and is now 3875536896. 2025-12-04T13:38:32.0407079Z 2025-12-04T13:38:32.0407155Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0407393Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda 2025-12-04T13:38:32.0407396Z 2025-12-04T13:38:32.0407482Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0407562Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:38:32.0407630Z ======================= 1 failed, 7 deselected in 9.66s ======================== 2025-12-04T13:38:32.0407666Z Got exit code 1 2025-12-04T13:38:32.0407706Z Retrying single test... 2025-12-04T13:38:32.0407897Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-c2bb62c1be351938.xml 2025-12-04T13:38:32.0407957Z ============================= test session starts ============================== 2025-12-04T13:38:32.0408071Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.0408111Z cachedir: .pytest_cache 2025-12-04T13:38:32.0408269Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.0408316Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.0408356Z configfile: pytest.ini 2025-12-04T13:38:32.0408541Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.0408615Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.0408846Z stepcurrent: skipping 7 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_no_shard_cuda 2025-12-04T13:38:32.0408890Z Running 1 items in this shard 2025-12-04T13:38:32.0408892Z 2025-12-04T13:38:32.0409206Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_no_shard_cuda I1204 13:05:14.133000 386115 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 386184 2025-12-04T13:38:32.0409360Z I1204 13:05:14.133000 386115 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 386185 2025-12-04T13:38:32.0409511Z I1204 13:05:14.134000 386115 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 386186 2025-12-04T13:38:32.0409705Z I1204 13:05:14.134000 386115 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 386187 2025-12-04T13:38:32.0410014Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0410066Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0410643Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0410683Z _warn_cpu_init() 2025-12-04T13:38:32.0410985Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:38:32.0411035Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0411600Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0411650Z _warn_cpu_init() 2025-12-04T13:38:32.0411937Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0412017Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0412305Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0412379Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0412667Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0412714Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0413287Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0413338Z _warn_cpu_init() 2025-12-04T13:38:32.0413621Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0413668Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0414234Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0414272Z _warn_cpu_init() 2025-12-04T13:38:32.0414569Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:38:32.0414642Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0414927Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0414999Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0416277Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.0416422Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.0416650Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0416694Z return func(*args, **kwargs) 2025-12-04T13:38:32.0417947Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.0418082Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.0418309Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0418350Z return func(*args, **kwargs) 2025-12-04T13:38:32.0419656Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. 
This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.0419778Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.0420006Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0420046Z return func(*args, **kwargs) 2025-12-04T13:38:32.0421304Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.0421436Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.0421662Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0421702Z return func(*args, **kwargs) 2025-12-04T13:38:32.0421922Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.0421962Z return func(*args, **kwargs) 2025-12-04T13:38:32.0422181Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.0422235Z return func(*args, **kwargs) 2025-12-04T13:38:32.0422457Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 
2025-12-04T13:38:32.0422497Z return func(*args, **kwargs) 2025-12-04T13:38:32.0422716Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.0422756Z return func(*args, **kwargs) 2025-12-04T13:38:32.0423051Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.0423092Z return func(*args, **kwargs) 2025-12-04T13:38:32.0423239Z [rank3]:E1204 13:05:21.719000 386187 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0423403Z [rank3]:E1204 13:05:21.719000 386187 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0423715Z [rank3]:E1204 13:05:21.719000 386187 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0423871Z [rank3]:E1204 13:05:21.719000 386187 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0424155Z [rank3]:E1204 13:05:21.719000 386187 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0424280Z [rank3]:E1204 13:05:21.719000 386187 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0424567Z [rank3]:E1204 13:05:21.719000 386187 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0424718Z [rank3]:E1204 13:05:21.719000 386187 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0424998Z [rank3]:E1204 13:05:21.719000 386187 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0425146Z [rank3]:E1204 13:05:21.719000 386187 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0425436Z [rank3]:E1204 13:05:21.719000 386187 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0425576Z [rank3]:E1204 13:05:21.719000 386187 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0425858Z [rank3]:E1204 13:05:21.719000 386187 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0426009Z [rank3]:E1204 13:05:21.719000 386187 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0426499Z [rank3]:E1204 13:05:21.719000 386187 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver 
API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 164352 on device 3. CUDA driver allocated memory was 2250244096 and is now 3808428032. 2025-12-04T13:38:32.0426629Z [rank3]:E1204 13:05:21.719000 386187 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0426829Z [rank3]:E1204 13:05:21.719000 386187 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0427194Z [rank3]:E1204 13:05:21.719000 386187 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda 2025-12-04T13:38:32.0427311Z [rank3]:E1204 13:05:21.719000 386187 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0427525Z [rank3]:E1204 13:05:21.719000 386187 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0427699Z [rank3]:E1204 13:05:21.719000 386187 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.0427738Z dist init r=3, world=4 2025-12-04T13:38:32.0427890Z [rank0]:E1204 13:05:21.771000 386184 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0428051Z [rank0]:E1204 13:05:21.771000 386184 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0428342Z [rank0]:E1204 13:05:21.771000 386184 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0428497Z [rank0]:E1204 13:05:21.771000 386184 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0428783Z [rank0]:E1204 13:05:21.771000 386184 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0428919Z [rank0]:E1204 13:05:21.771000 386184 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0429195Z [rank0]:E1204 13:05:21.771000 386184 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0429345Z [rank0]:E1204 13:05:21.771000 386184 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0429654Z [rank0]:E1204 13:05:21.771000 386184 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0429820Z [rank0]:E1204 13:05:21.771000 386184 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0430097Z [rank0]:E1204 13:05:21.771000 386184 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0430233Z [rank0]:E1204 13:05:21.771000 386184 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0430514Z [rank0]:E1204 13:05:21.771000 386184 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0430661Z [rank0]:E1204 13:05:21.771000 386184 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0431164Z [rank0]:E1204 13:05:21.771000 386184 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 152064 on device 0. CUDA driver allocated memory was 2453667840 and is now 4011851776. 2025-12-04T13:38:32.0431277Z [rank0]:E1204 13:05:21.771000 386184 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0431477Z [rank0]:E1204 13:05:21.771000 386184 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0431842Z [rank0]:E1204 13:05:21.771000 386184 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda 2025-12-04T13:38:32.0431956Z [rank0]:E1204 13:05:21.771000 386184 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0432177Z [rank0]:E1204 13:05:21.771000 386184 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0432355Z [rank0]:E1204 13:05:21.771000 386184 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.0432396Z dist init r=0, world=4 2025-12-04T13:38:32.0432533Z [rank1]:E1204 13:05:21.794000 386185 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0432694Z [rank1]:E1204 13:05:21.794000 386185 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0432984Z [rank1]:E1204 13:05:21.794000 386185 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0433154Z [rank1]:E1204 13:05:21.794000 386185 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0433441Z [rank1]:E1204 13:05:21.794000 386185 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0433564Z [rank1]:E1204 13:05:21.794000 386185 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0433842Z [rank1]:E1204 13:05:21.794000 386185 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0434001Z [rank1]:E1204 13:05:21.794000 386185 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0434278Z [rank1]:E1204 13:05:21.794000 386185 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0434426Z [rank1]:E1204 13:05:21.794000 386185 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0434703Z [rank1]:E1204 13:05:21.794000 386185 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0434841Z [rank1]:E1204 13:05:21.794000 386185 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0435124Z [rank1]:E1204 13:05:21.794000 386185 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0435287Z [rank1]:E1204 13:05:21.794000 386185 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0435771Z [rank1]:E1204 13:05:21.794000 386185 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 156160 on device 1. CUDA driver allocated memory was 2317352960 and is now 3875536896. 
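[editor note] The RuntimeError above is raised by the harness's CUDA memory-leak check, enabled in this shard via PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1; it compares per-device allocations before and after the test body. The sketch below is only an illustration of that kind of before/after comparison, not the harness's actual implementation; the device index and threshold are assumptions.

    import torch

    def snapshot(device: int):
        # Caching-allocator bytes (the "allocated memory was 512 ..." figure) and
        # driver-level usage (total - free, roughly the "CUDA driver allocated memory" figure).
        free, total = torch.cuda.mem_get_info(device)
        return torch.cuda.memory_allocated(device), total - free

    before = snapshot(0)
    # ... run the test body on device 0 ...
    torch.cuda.synchronize(0)
    after = snapshot(0)
    if after[0] > before[0]:
        raise RuntimeError(
            f"possible leak: caching allocator grew from {before[0]} to {after[0]} bytes"
        )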
2025-12-04T13:38:32.0435887Z [rank1]:E1204 13:05:21.794000 386185 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0436082Z [rank1]:E1204 13:05:21.794000 386185 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0436447Z [rank1]:E1204 13:05:21.794000 386185 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda 2025-12-04T13:38:32.0436572Z [rank1]:E1204 13:05:21.794000 386185 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0436782Z [rank1]:E1204 13:05:21.794000 386185 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0436947Z [rank1]:E1204 13:05:21.794000 386185 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.0436986Z dist init r=1, world=4 2025-12-04T13:38:32.0437125Z [rank2]:E1204 13:05:21.802000 386186 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0437285Z [rank2]:E1204 13:05:21.802000 386186 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0437591Z [rank2]:E1204 13:05:21.802000 386186 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0437744Z [rank2]:E1204 13:05:21.802000 386186 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0438030Z [rank2]:E1204 13:05:21.802000 386186 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0438166Z [rank2]:E1204 13:05:21.802000 386186 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0438445Z [rank2]:E1204 13:05:21.802000 386186 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0438594Z [rank2]:E1204 13:05:21.802000 386186 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0438872Z [rank2]:E1204 13:05:21.802000 386186 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0439021Z [rank2]:E1204 13:05:21.802000 386186 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0439295Z [rank2]:E1204 13:05:21.802000 386186 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0439450Z [rank2]:E1204 13:05:21.802000 386186 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.0439766Z [rank2]:E1204 13:05:21.802000 386186 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0439915Z [rank2]:E1204 13:05:21.802000 386186 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0440407Z [rank2]:E1204 13:05:21.802000 386186 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 156160 on device 2. CUDA driver allocated memory was 2300575744 and is now 3858759680. 2025-12-04T13:38:32.0440523Z [rank2]:E1204 13:05:21.802000 386186 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0440721Z [rank2]:E1204 13:05:21.802000 386186 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0441098Z [rank2]:E1204 13:05:21.802000 386186 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda 2025-12-04T13:38:32.0441212Z [rank2]:E1204 13:05:21.802000 386186 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0441423Z [rank2]:E1204 13:05:21.802000 386186 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0441590Z [rank2]:E1204 13:05:21.802000 386186 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.0441631Z dist init r=2, world=4 2025-12-04T13:38:32.0441983Z [rank0]:[W1204 13:05:22.019907292 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.0442026Z FAILED [9.5210s] [100%] 2025-12-04T13:38:32.0442029Z 2025-12-04T13:38:32.0442087Z =================================== FAILURES =================================== 2025-12-04T13:38:32.0442196Z _ TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda _ 2025-12-04T13:38:32.0442245Z Traceback (most recent call last): 2025-12-04T13:38:32.0442425Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.0442470Z self._join_processes(fn) 2025-12-04T13:38:32.0442646Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.0442702Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.0442883Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.0442929Z raise RuntimeError(error) 2025-12-04T13:38:32.0443010Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.0443058Z Traceback (most recent call last): 2025-12-04T13:38:32.0443220Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0443268Z getattr(self, test_name)() 2025-12-04T13:38:32.0443439Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0443477Z fn() 2025-12-04T13:38:32.0443630Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0443675Z method(*args, **kwargs) 2025-12-04T13:38:32.0443828Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0443872Z method(*args, **kwargs) 2025-12-04T13:38:32.0444024Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0444064Z with policy(): 2025-12-04T13:38:32.0444217Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0444262Z raise RuntimeError(msg) 2025-12-04T13:38:32.0444621Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 164352 on device 3. CUDA driver allocated memory was 2250244096 and is now 3808428032. 
2025-12-04T13:38:32.0444625Z 2025-12-04T13:38:32.0444703Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0444953Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda 2025-12-04T13:38:32.0444958Z 2025-12-04T13:38:32.0445046Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0445048Z 2025-12-04T13:38:32.0445050Z 2025-12-04T13:38:32.0445133Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.0445222Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.0445461Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-c2bb62c1be351938.xml - 2025-12-04T13:38:32.0445521Z =========================== short test summary info ============================ 2025-12-04T13:38:32.0445789Z FAILED [9.5210s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_no_shard_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.0445835Z Traceback (most recent call last): 2025-12-04T13:38:32.0446002Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0446043Z getattr(self, test_name)() 2025-12-04T13:38:32.0446207Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0446253Z fn() 2025-12-04T13:38:32.0446407Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0446446Z method(*args, **kwargs) 2025-12-04T13:38:32.0446601Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0446643Z method(*args, **kwargs) 2025-12-04T13:38:32.0446797Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0446836Z with policy(): 2025-12-04T13:38:32.0446989Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0447033Z raise RuntimeError(msg) 2025-12-04T13:38:32.0447402Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 164352 on device 3. CUDA driver allocated memory was 2250244096 and is now 3808428032. 2025-12-04T13:38:32.0447416Z 2025-12-04T13:38:32.0447493Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0447732Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda 2025-12-04T13:38:32.0447734Z 2025-12-04T13:38:32.0447823Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0447886Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
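Note on the failure mode above: this shard runs the mem_leak_check variant of the job, so each test body executes inside a check that records per-device memory counters before the test and compares them afterwards, raising the RuntimeError seen in every rank's traceback when both the caching-allocator and driver-level numbers grow. A minimal sketch of that kind of before/after comparison, using only public torch.cuda calls; the helper name and structure are illustrative and not the actual torch.testing._internal.common_utils.CudaMemoryLeakCheck implementation:

import torch

def assert_no_leak(device: int, run_test) -> None:
    # Illustrative before/after memory comparison (hypothetical helper).
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)    # caching-allocator bytes in use
    free_before, total = torch.cuda.mem_get_info(device)  # driver-level free/total bytes
    driver_before = total - free_before

    run_test()

    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()                               # release cached blocks first
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _ = torch.cuda.mem_get_info(device)
    driver_after = total - free_after

    if alloc_after > alloc_before and driver_after > driver_before:
        raise RuntimeError(
            f"possible leak on device {device}: caching allocator "
            f"{alloc_before} -> {alloc_after} bytes, driver "
            f"{driver_before} -> {driver_after} bytes"
        )

The repro command printed in the log (PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py ...) enables this class of check when rerunning the test outside CI.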
2025-12-04T13:38:32.0447951Z ======================= 1 failed, 32 deselected in 9.67s ======================= 2025-12-04T13:38:32.0447989Z Got exit code 1 2025-12-04T13:38:32.0448033Z Retrying single test... 2025-12-04T13:38:32.0448223Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-0435229ba06dfe2d.xml 2025-12-04T13:38:32.0448285Z ============================= test session starts ============================== 2025-12-04T13:38:32.0448401Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.0448443Z cachedir: .pytest_cache 2025-12-04T13:38:32.0448615Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.0448663Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.0448705Z configfile: pytest.ini 2025-12-04T13:38:32.0448870Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.0448947Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.0449173Z stepcurrent: skipping 7 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_no_shard_cuda 2025-12-04T13:38:32.0449222Z Running 1 items in this shard 2025-12-04T13:38:32.0449224Z 2025-12-04T13:38:32.0449554Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_no_shard_cuda I1204 13:05:26.331000 386517 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 386586 2025-12-04T13:38:32.0449761Z I1204 13:05:26.332000 386517 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 386587 2025-12-04T13:38:32.0449914Z I1204 13:05:26.333000 386517 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 386588 2025-12-04T13:38:32.0450066Z I1204 13:05:26.333000 386517 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 386589 2025-12-04T13:38:32.0450377Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0450429Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0450721Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0450771Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0451358Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.0451411Z _warn_cpu_init() 2025-12-04T13:38:32.0451984Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0452024Z _warn_cpu_init() 2025-12-04T13:38:32.0452314Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0452394Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0452682Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0452760Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0453059Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0453112Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0453681Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0453722Z _warn_cpu_init() 2025-12-04T13:38:32.0454025Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0454074Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0454648Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0454704Z _warn_cpu_init() 2025-12-04T13:38:32.0454995Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:38:32.0455071Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0455362Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0455440Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0456707Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.0456864Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.0457096Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0457142Z return func(*args, **kwargs) 2025-12-04T13:38:32.0458411Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.0458537Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.0458772Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0458820Z return func(*args, **kwargs) 2025-12-04T13:38:32.0460116Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. 
This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.0460253Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.0460480Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0460522Z return func(*args, **kwargs) 2025-12-04T13:38:32.0461771Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.0461909Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.0462143Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0462187Z return func(*args, **kwargs) 2025-12-04T13:38:32.0462423Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.0462467Z return func(*args, **kwargs) 2025-12-04T13:38:32.0462688Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.0462731Z return func(*args, **kwargs) 2025-12-04T13:38:32.0462953Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 
2025-12-04T13:38:32.0462994Z return func(*args, **kwargs) 2025-12-04T13:38:32.0463216Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.0463271Z return func(*args, **kwargs) 2025-12-04T13:38:32.0463567Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.0463607Z return func(*args, **kwargs) 2025-12-04T13:38:32.0463753Z [rank3]:E1204 13:05:33.796000 386589 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0463915Z [rank3]:E1204 13:05:33.796000 386589 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0464222Z [rank3]:E1204 13:05:33.796000 386589 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0464380Z [rank3]:E1204 13:05:33.796000 386589 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0464669Z [rank3]:E1204 13:05:33.796000 386589 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0464796Z [rank3]:E1204 13:05:33.796000 386589 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0465073Z [rank3]:E1204 13:05:33.796000 386589 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0465239Z [rank3]:E1204 13:05:33.796000 386589 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0465516Z [rank3]:E1204 13:05:33.796000 386589 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0465666Z [rank3]:E1204 13:05:33.796000 386589 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0465941Z [rank3]:E1204 13:05:33.796000 386589 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0466081Z [rank3]:E1204 13:05:33.796000 386589 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0466369Z [rank3]:E1204 13:05:33.796000 386589 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0466521Z [rank3]:E1204 13:05:33.796000 386589 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0467022Z [rank3]:E1204 13:05:33.796000 386589 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver 
API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 3. CUDA driver allocated memory was 2250244096 and is now 3808428032. 2025-12-04T13:38:32.0467138Z [rank3]:E1204 13:05:33.796000 386589 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0467340Z [rank3]:E1204 13:05:33.796000 386589 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0467722Z [rank3]:E1204 13:05:33.796000 386589 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda 2025-12-04T13:38:32.0467838Z [rank3]:E1204 13:05:33.796000 386589 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0468052Z [rank3]:E1204 13:05:33.796000 386589 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0468217Z [rank3]:E1204 13:05:33.796000 386589 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.0468270Z dist init r=3, world=4 2025-12-04T13:38:32.0468410Z [rank0]:E1204 13:05:33.846000 386586 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0468571Z [rank0]:E1204 13:05:33.846000 386586 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0468861Z [rank0]:E1204 13:05:33.846000 386586 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0469017Z [rank0]:E1204 13:05:33.846000 386586 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0469302Z [rank0]:E1204 13:05:33.846000 386586 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0469438Z [rank0]:E1204 13:05:33.846000 386586 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0469762Z [rank0]:E1204 13:05:33.846000 386586 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0469913Z [rank0]:E1204 13:05:33.846000 386586 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0470192Z [rank0]:E1204 13:05:33.846000 386586 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0470337Z [rank0]:E1204 13:05:33.846000 386586 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0470617Z [rank0]:E1204 13:05:33.846000 386586 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0470754Z [rank0]:E1204 13:05:33.846000 386586 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0471052Z [rank0]:E1204 13:05:33.846000 386586 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0471203Z [rank0]:E1204 13:05:33.846000 386586 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0471688Z [rank0]:E1204 13:05:33.846000 386586 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 156160 on device 0. CUDA driver allocated memory was 2453667840 and is now 4011851776. 2025-12-04T13:38:32.0471813Z [rank0]:E1204 13:05:33.846000 386586 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0472029Z [rank0]:E1204 13:05:33.846000 386586 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0472395Z [rank0]:E1204 13:05:33.846000 386586 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda 2025-12-04T13:38:32.0472511Z [rank0]:E1204 13:05:33.846000 386586 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0472725Z [rank0]:E1204 13:05:33.846000 386586 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0472907Z [rank0]:E1204 13:05:33.846000 386586 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.0472946Z dist init r=0, world=4 2025-12-04T13:38:32.0473088Z [rank2]:E1204 13:05:33.848000 386588 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0473247Z [rank2]:E1204 13:05:33.848000 386588 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0473537Z [rank2]:E1204 13:05:33.848000 386588 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0473691Z [rank2]:E1204 13:05:33.848000 386588 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0473994Z [rank2]:E1204 13:05:33.848000 386588 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0474123Z [rank2]:E1204 13:05:33.848000 386588 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0474401Z [rank2]:E1204 13:05:33.848000 386588 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0474552Z [rank2]:E1204 13:05:33.848000 386588 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0474830Z [rank2]:E1204 13:05:33.848000 386588 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0474983Z [rank2]:E1204 13:05:33.848000 386588 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0475271Z [rank2]:E1204 13:05:33.848000 386588 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0475410Z [rank2]:E1204 13:05:33.848000 386588 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0475687Z [rank2]:E1204 13:05:33.848000 386588 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0475839Z [rank2]:E1204 13:05:33.848000 386588 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0476338Z [rank2]:E1204 13:05:33.848000 386588 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 156160 on device 2. CUDA driver allocated memory was 2300575744 and is now 3858759680. 
2025-12-04T13:38:32.0476452Z [rank2]:E1204 13:05:33.848000 386588 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0476652Z [rank2]:E1204 13:05:33.848000 386588 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0477013Z [rank2]:E1204 13:05:33.848000 386588 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda 2025-12-04T13:38:32.0477140Z [rank2]:E1204 13:05:33.848000 386588 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0477353Z [rank2]:E1204 13:05:33.848000 386588 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0477518Z [rank2]:E1204 13:05:33.848000 386588 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.0477559Z dist init r=2, world=4 2025-12-04T13:38:32.0477696Z [rank1]:E1204 13:05:33.859000 386587 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0477858Z [rank1]:E1204 13:05:33.859000 386587 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0478146Z [rank1]:E1204 13:05:33.859000 386587 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0478315Z [rank1]:E1204 13:05:33.859000 386587 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0478602Z [rank1]:E1204 13:05:33.859000 386587 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0478728Z [rank1]:E1204 13:05:33.859000 386587 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0479009Z [rank1]:E1204 13:05:33.859000 386587 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0479157Z [rank1]:E1204 13:05:33.859000 386587 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0479436Z [rank1]:E1204 13:05:33.859000 386587 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0479629Z [rank1]:E1204 13:05:33.859000 386587 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0479911Z [rank1]:E1204 13:05:33.859000 386587 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0480048Z [rank1]:E1204 13:05:33.859000 386587 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.0480330Z [rank1]:E1204 13:05:33.859000 386587 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0480495Z [rank1]:E1204 13:05:33.859000 386587 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0480976Z [rank1]:E1204 13:05:33.859000 386587 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 156160 on device 1. CUDA driver allocated memory was 2317352960 and is now 3875536896. 2025-12-04T13:38:32.0481093Z [rank1]:E1204 13:05:33.859000 386587 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0481301Z [rank1]:E1204 13:05:33.859000 386587 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0481667Z [rank1]:E1204 13:05:33.859000 386587 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda 2025-12-04T13:38:32.0481781Z [rank1]:E1204 13:05:33.859000 386587 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0481995Z [rank1]:E1204 13:05:33.859000 386587 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0482163Z [rank1]:E1204 13:05:33.859000 386587 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.0482203Z dist init r=1, world=4 2025-12-04T13:38:32.0482543Z [rank0]:[W1204 13:05:34.113669637 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.0482596Z FAILED [9.4192s] [100%] 2025-12-04T13:38:32.0482599Z 2025-12-04T13:38:32.0482660Z =================================== FAILURES =================================== 2025-12-04T13:38:32.0482765Z _ TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda _ 2025-12-04T13:38:32.0482816Z Traceback (most recent call last): 2025-12-04T13:38:32.0482979Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.0483028Z self._join_processes(fn) 2025-12-04T13:38:32.0483201Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.0483261Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.0483440Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.0483488Z raise RuntimeError(error) 2025-12-04T13:38:32.0483569Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.0483618Z Traceback (most recent call last): 2025-12-04T13:38:32.0483792Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0483839Z getattr(self, test_name)() 2025-12-04T13:38:32.0483999Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0484037Z fn() 2025-12-04T13:38:32.0484195Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0484238Z method(*args, **kwargs) 2025-12-04T13:38:32.0484395Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0484438Z method(*args, **kwargs) 2025-12-04T13:38:32.0484602Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0484642Z with policy(): 2025-12-04T13:38:32.0484798Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0484840Z raise RuntimeError(msg) 2025-12-04T13:38:32.0485203Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 3. CUDA driver allocated memory was 2250244096 and is now 3808428032. 
2025-12-04T13:38:32.0485217Z 2025-12-04T13:38:32.0485293Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0485533Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda 2025-12-04T13:38:32.0485536Z 2025-12-04T13:38:32.0485624Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0485630Z 2025-12-04T13:38:32.0485632Z 2025-12-04T13:38:32.0485708Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.0485801Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.0486037Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-0435229ba06dfe2d.xml - 2025-12-04T13:38:32.0486104Z =========================== short test summary info ============================ 2025-12-04T13:38:32.0486371Z FAILED [9.4192s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_no_shard_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.0486422Z Traceback (most recent call last): 2025-12-04T13:38:32.0486586Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0486633Z getattr(self, test_name)() 2025-12-04T13:38:32.0486794Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0486832Z fn() 2025-12-04T13:38:32.0486984Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0487028Z method(*args, **kwargs) 2025-12-04T13:38:32.0487179Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0487224Z method(*args, **kwargs) 2025-12-04T13:38:32.0487375Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0487416Z with policy(): 2025-12-04T13:38:32.0487598Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0487645Z raise RuntimeError(msg) 2025-12-04T13:38:32.0488011Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 3. CUDA driver allocated memory was 2250244096 and is now 3808428032. 2025-12-04T13:38:32.0488013Z 2025-12-04T13:38:32.0488089Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0488328Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_no_shard_cuda 2025-12-04T13:38:32.0488331Z 2025-12-04T13:38:32.0488417Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0488495Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
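Note on the repeated [rank0] ProcessGroupNCCL warning above: it reports that destroy_process_group() was not called before program exit, which can leak communicator resources. A minimal sketch of the explicit-teardown pattern the warning asks for, using the public torch.distributed API; the main() scaffolding is illustrative and not part of the test harness, which manages its own process groups:

import torch.distributed as dist

def main() -> None:
    # "nccl" backend, which on this ROCm build corresponds to the
    # ProcessGroupNCCL path that emits the warning in the log.
    dist.init_process_group("nccl")
    try:
        ...  # test or training body
    finally:
        # Explicit teardown releases communicator resources and avoids the
        # "destroy_process_group() was not called before program exit" warning.
        dist.destroy_process_group()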
2025-12-04T13:38:32.0488559Z ======================= 1 failed, 32 deselected in 9.58s ======================= 2025-12-04T13:38:32.0488600Z Got exit code 1 2025-12-04T13:38:32.0488786Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_no_shard_cuda 2025-12-04T13:38:32.0488917Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:32.0489104Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-3a97c4aa2fa32d8f.xml 2025-12-04T13:38:32.0489182Z ============================= test session starts ============================== 2025-12-04T13:38:32.0489299Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.0489342Z cachedir: .pytest_cache 2025-12-04T13:38:32.0489506Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.0489554Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.0489629Z configfile: pytest.ini 2025-12-04T13:38:32.0489794Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.0489871Z collecting ... collected 60 items / 8 deselected / 52 selected 2025-12-04T13:38:32.0489925Z stepcurrent: skipping 8 already run items. 2025-12-04T13:38:32.0489973Z Running 25 items in this shard 2025-12-04T13:38:32.0489976Z 2025-12-04T13:38:32.0490302Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_none_cuda I1204 13:05:38.381000 386919 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 386988 2025-12-04T13:38:32.0490464Z I1204 13:05:38.382000 386919 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 386989 2025-12-04T13:38:32.0490616Z I1204 13:05:38.382000 386919 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 386990 2025-12-04T13:38:32.0490769Z I1204 13:05:38.383000 386919 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 386991 2025-12-04T13:38:32.0491346Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0491390Z _warn_cpu_init() 2025-12-04T13:38:32.0491971Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.0492010Z _warn_cpu_init() 2025-12-04T13:38:32.0492577Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0492619Z _warn_cpu_init() 2025-12-04T13:38:32.0493203Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0493244Z _warn_cpu_init() 2025-12-04T13:38:32.0493537Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.0493604Z return func(*args, **kwargs) 2025-12-04T13:38:32.0493747Z [rank0]:E1204 13:06:34.001000 386988 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0493912Z [rank0]:E1204 13:06:34.001000 386988 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0494200Z [rank0]:E1204 13:06:34.001000 386988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0494359Z [rank0]:E1204 13:06:34.001000 386988 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0494646Z [rank0]:E1204 13:06:34.001000 386988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0494783Z [rank0]:E1204 13:06:34.001000 386988 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0495065Z [rank0]:E1204 13:06:34.001000 386988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0495215Z [rank0]:E1204 13:06:34.001000 386988 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0495497Z [rank0]:E1204 13:06:34.001000 386988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0495646Z [rank0]:E1204 13:06:34.001000 386988 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0495928Z [rank0]:E1204 13:06:34.001000 386988 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0496079Z [rank0]:E1204 13:06:34.001000 386988 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0496357Z [rank0]:E1204 13:06:34.001000 386988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0496509Z [rank0]:E1204 13:06:34.001000 386988 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0496987Z [rank0]:E1204 13:06:34.001000 386988 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T13:38:32.0497117Z [rank0]:E1204 13:06:34.001000 386988 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0497315Z [rank0]:E1204 13:06:34.001000 386988 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0497677Z [rank0]:E1204 13:06:34.001000 386988 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda 2025-12-04T13:38:32.0497807Z [rank0]:E1204 13:06:34.001000 386988 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0498021Z [rank0]:E1204 13:06:34.001000 386988 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0498191Z [rank0]:E1204 13:06:34.001000 386988 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.0498232Z dist init r=0, world=4 2025-12-04T13:38:32.0498372Z [rank1]:E1204 13:06:34.018000 386989 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0498535Z [rank1]:E1204 13:06:34.018000 386989 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0498825Z [rank1]:E1204 13:06:34.018000 386989 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0498991Z [rank1]:E1204 13:06:34.018000 386989 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0499281Z [rank1]:E1204 13:06:34.018000 386989 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0499410Z [rank1]:E1204 13:06:34.018000 386989 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0499827Z [rank1]:E1204 13:06:34.018000 386989 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0499980Z [rank1]:E1204 13:06:34.018000 386989 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0500257Z [rank1]:E1204 13:06:34.018000 386989 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0500407Z [rank1]:E1204 13:06:34.018000 386989 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0500700Z [rank1]:E1204 13:06:34.018000 386989 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0500843Z [rank1]:E1204 13:06:34.018000 386989 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0501125Z [rank1]:E1204 13:06:34.018000 386989 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0501278Z [rank1]:E1204 13:06:34.018000 386989 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0501780Z [rank1]:E1204 13:06:34.018000 386989 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 
2025-12-04T13:38:32.0501895Z [rank1]:E1204 13:06:34.018000 386989 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0502093Z [rank1]:E1204 13:06:34.018000 386989 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0502464Z [rank1]:E1204 13:06:34.018000 386989 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda 2025-12-04T13:38:32.0502582Z [rank1]:E1204 13:06:34.018000 386989 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0502798Z [rank1]:E1204 13:06:34.018000 386989 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0502964Z [rank1]:E1204 13:06:34.018000 386989 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.0503008Z dist init r=1, world=4 2025-12-04T13:38:32.0503145Z [rank3]:E1204 13:06:34.025000 386991 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0503308Z [rank3]:E1204 13:06:34.025000 386991 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0503617Z [rank3]:E1204 13:06:34.025000 386991 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0503776Z [rank3]:E1204 13:06:34.025000 386991 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0504061Z [rank3]:E1204 13:06:34.025000 386991 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0504188Z [rank3]:E1204 13:06:34.025000 386991 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0504467Z [rank3]:E1204 13:06:34.025000 386991 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0504615Z [rank3]:E1204 13:06:34.025000 386991 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0504906Z [rank3]:E1204 13:06:34.025000 386991 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0505055Z [rank3]:E1204 13:06:34.025000 386991 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0505337Z [rank3]:E1204 13:06:34.025000 386991 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0505474Z [rank3]:E1204 13:06:34.025000 386991 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.0505764Z [rank3]:E1204 13:06:34.025000 386991 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0505917Z [rank3]:E1204 13:06:34.025000 386991 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0506395Z [rank3]:E1204 13:06:34.025000 386991 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T13:38:32.0506523Z [rank3]:E1204 13:06:34.025000 386991 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0506718Z [rank3]:E1204 13:06:34.025000 386991 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0507079Z [rank3]:E1204 13:06:34.025000 386991 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda 2025-12-04T13:38:32.0507194Z [rank3]:E1204 13:06:34.025000 386991 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0507406Z [rank3]:E1204 13:06:34.025000 386991 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0507576Z [rank3]:E1204 13:06:34.025000 386991 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.0507627Z dist init r=3, world=4 2025-12-04T13:38:32.0507769Z [rank2]:E1204 13:06:34.101000 386990 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0507929Z [rank2]:E1204 13:06:34.101000 386990 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0508220Z [rank2]:E1204 13:06:34.101000 386990 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0508376Z [rank2]:E1204 13:06:34.101000 386990 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0508667Z [rank2]:E1204 13:06:34.101000 386990 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0508793Z [rank2]:E1204 13:06:34.101000 386990 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0509086Z [rank2]:E1204 13:06:34.101000 386990 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0509237Z [rank2]:E1204 13:06:34.101000 386990 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:38:32.0509511Z [rank2]:E1204 13:06:34.101000 386990 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0509714Z [rank2]:E1204 13:06:34.101000 386990 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0509995Z [rank2]:E1204 13:06:34.101000 386990 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0510149Z [rank2]:E1204 13:06:34.101000 386990 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0510427Z [rank2]:E1204 13:06:34.101000 386990 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0510580Z [rank2]:E1204 13:06:34.101000 386990 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0511060Z [rank2]:E1204 13:06:34.101000 386990 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T13:38:32.0514162Z [rank2]:E1204 13:06:34.101000 386990 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0514364Z [rank2]:E1204 13:06:34.101000 386990 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0514722Z [rank2]:E1204 13:06:34.101000 386990 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda 2025-12-04T13:38:32.0514840Z [rank2]:E1204 13:06:34.101000 386990 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0515082Z [rank2]:E1204 13:06:34.101000 386990 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0515248Z [rank2]:E1204 13:06:34.101000 386990 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.0515290Z dist init r=2, world=4 2025-12-04T13:38:32.0515634Z [rank0]:[W1204 13:06:34.164417214 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.0515681Z FAILED [57.5829s] [ 4%] 2025-12-04T13:38:32.0515683Z 2025-12-04T13:38:32.0515740Z =================================== FAILURES =================================== 2025-12-04T13:38:32.0515843Z __ TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda ___ 2025-12-04T13:38:32.0515893Z Traceback (most recent call last): 2025-12-04T13:38:32.0516063Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.0516109Z self._join_processes(fn) 2025-12-04T13:38:32.0516289Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.0516357Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.0516542Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.0516588Z raise RuntimeError(error) 2025-12-04T13:38:32.0516670Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.0516716Z Traceback (most recent call last): 2025-12-04T13:38:32.0516880Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0516927Z getattr(self, test_name)() 2025-12-04T13:38:32.0517087Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0517126Z fn() 2025-12-04T13:38:32.0517290Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0517335Z method(*args, **kwargs) 2025-12-04T13:38:32.0517488Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0517533Z method(*args, **kwargs) 2025-12-04T13:38:32.0517685Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0517727Z with policy(): 2025-12-04T13:38:32.0517891Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0517938Z raise RuntimeError(msg) 2025-12-04T13:38:32.0518291Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 
2025-12-04T13:38:32.0518295Z 2025-12-04T13:38:32.0518376Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0518611Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda 2025-12-04T13:38:32.0518613Z 2025-12-04T13:38:32.0518702Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0518705Z 2025-12-04T13:38:32.0518768Z Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.0518826Z Traceback (most recent call last): 2025-12-04T13:38:32.0518993Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0519035Z getattr(self, test_name)() 2025-12-04T13:38:32.0519200Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0519238Z fn() 2025-12-04T13:38:32.0519393Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0519433Z method(*args, **kwargs) 2025-12-04T13:38:32.0519629Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0519670Z method(*args, **kwargs) 2025-12-04T13:38:32.0519824Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0519865Z with policy(): 2025-12-04T13:38:32.0520022Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0520063Z raise RuntimeError(msg) 2025-12-04T13:38:32.0520439Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T13:38:32.0520441Z 2025-12-04T13:38:32.0520518Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0520753Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda 2025-12-04T13:38:32.0520756Z 2025-12-04T13:38:32.0520847Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0520850Z 2025-12-04T13:38:32.0520852Z 2025-12-04T13:38:32.0520928Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.0521019Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:38:32.0521270Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-3a97c4aa2fa32d8f.xml - 2025-12-04T13:38:32.0521337Z =========================== short test summary info ============================ 2025-12-04T13:38:32.0521594Z FAILED [57.5829s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.0521644Z Traceback (most recent call last): 2025-12-04T13:38:32.0521831Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0521878Z getattr(self, test_name)() 2025-12-04T13:38:32.0522038Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0522077Z fn() 2025-12-04T13:38:32.0522231Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0522275Z method(*args, **kwargs) 2025-12-04T13:38:32.0522427Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0522472Z method(*args, **kwargs) 2025-12-04T13:38:32.0522626Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0522664Z with policy(): 2025-12-04T13:38:32.0522821Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0522892Z raise RuntimeError(msg) 2025-12-04T13:38:32.0523247Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 
2025-12-04T13:38:32.0523250Z 2025-12-04T13:38:32.0523323Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0523556Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda 2025-12-04T13:38:32.0523559Z 2025-12-04T13:38:32.0523645Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0523647Z 2025-12-04T13:38:32.0523709Z Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.0523755Z Traceback (most recent call last): 2025-12-04T13:38:32.0523921Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0523963Z getattr(self, test_name)() 2025-12-04T13:38:32.0524127Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0524177Z fn() 2025-12-04T13:38:32.0524330Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0524374Z method(*args, **kwargs) 2025-12-04T13:38:32.0524524Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0524568Z method(*args, **kwargs) 2025-12-04T13:38:32.0524719Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0524761Z with policy(): 2025-12-04T13:38:32.0524912Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0524957Z raise RuntimeError(msg) 2025-12-04T13:38:32.0525315Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T13:38:32.0525317Z 2025-12-04T13:38:32.0525394Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0525625Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda 2025-12-04T13:38:32.0525640Z 2025-12-04T13:38:32.0525730Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0525795Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.0525862Z ======================= 1 failed, 8 deselected in 57.72s ======================= 2025-12-04T13:38:32.0525904Z Got exit code 1 2025-12-04T13:38:32.0525946Z Retrying single test... 
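Note: the leak check that fails above records per-device caching-allocator usage before the test body and compares it afterwards (512 bytes before vs 49664 bytes after on every rank), and the ProcessGroupNCCL warning points out that destroy_process_group() was never called before exit. The following is a rough, standalone sketch of that bookkeeping using public torch.cuda APIs; the helper name and the strict greater-than threshold are illustrative assumptions, not the actual implementation in common_utils.py.

import torch
import torch.distributed as dist

def run_with_leak_check(test_fn, device_index: int = 0) -> None:
    # Illustrative sketch only, not PyTorch's CUDA mem-leak-check policy.
    device = torch.device("cuda", device_index)
    torch.cuda.synchronize(device)
    before = torch.cuda.memory_allocated(device)  # caching-allocator bytes before the test
    test_fn()
    if dist.is_initialized():
        # Explicit teardown, as requested by the ProcessGroupNCCL warning above.
        dist.destroy_process_group()
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    after = torch.cuda.memory_allocated(device)  # bytes still held after the test
    if after > before:
        raise RuntimeError(
            f"possible leak on {device}: {before} bytes before vs {after} bytes after"
        )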
2025-12-04T13:38:32.0526140Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4c3c53bf15bf011b.xml 2025-12-04T13:38:32.0526199Z ============================= test session starts ============================== 2025-12-04T13:38:32.0526319Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.0526362Z cachedir: .pytest_cache 2025-12-04T13:38:32.0526523Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.0526618Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.0526662Z configfile: pytest.ini 2025-12-04T13:38:32.0526825Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.0526904Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.0527130Z stepcurrent: skipping 8 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_none_cuda 2025-12-04T13:38:32.0527176Z Running 1 items in this shard 2025-12-04T13:38:32.0527178Z 2025-12-04T13:38:32.0527490Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_none_cuda I1204 13:06:38.614000 387321 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 387390 2025-12-04T13:38:32.0527651Z I1204 13:06:38.615000 387321 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 387391 2025-12-04T13:38:32.0527807Z I1204 13:06:38.615000 387321 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 387392 2025-12-04T13:38:32.0527963Z I1204 13:06:38.616000 387321 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 387393 2025-12-04T13:38:32.0528559Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0528599Z _warn_cpu_init() 2025-12-04T13:38:32.0529184Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0529231Z _warn_cpu_init() 2025-12-04T13:38:32.0529889Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0529946Z _warn_cpu_init() 2025-12-04T13:38:32.0530518Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0530557Z _warn_cpu_init() 2025-12-04T13:38:32.0530850Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.0530893Z return func(*args, **kwargs) 2025-12-04T13:38:32.0531035Z [rank2]:E1204 13:07:34.328000 387392 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0531223Z [rank2]:E1204 13:07:34.328000 387392 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0531518Z [rank2]:E1204 13:07:34.328000 387392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0531673Z [rank2]:E1204 13:07:34.328000 387392 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0531960Z [rank2]:E1204 13:07:34.328000 387392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0532084Z [rank2]:E1204 13:07:34.328000 387392 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0532363Z [rank2]:E1204 13:07:34.328000 387392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0532513Z [rank2]:E1204 13:07:34.328000 387392 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0532809Z [rank2]:E1204 13:07:34.328000 387392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0532957Z [rank2]:E1204 13:07:34.328000 387392 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0533232Z [rank2]:E1204 13:07:34.328000 387392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0533371Z [rank2]:E1204 13:07:34.328000 387392 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0533663Z [rank2]:E1204 13:07:34.328000 387392 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0533813Z [rank2]:E1204 13:07:34.328000 387392 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0534294Z [rank2]:E1204 13:07:34.328000 387392 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T13:38:32.0534424Z [rank2]:E1204 13:07:34.328000 387392 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0534621Z [rank2]:E1204 13:07:34.328000 387392 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0534979Z [rank2]:E1204 13:07:34.328000 387392 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda 2025-12-04T13:38:32.0535096Z [rank2]:E1204 13:07:34.328000 387392 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0535309Z [rank2]:E1204 13:07:34.328000 387392 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0535475Z [rank2]:E1204 13:07:34.328000 387392 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.0535525Z dist init r=2, world=4 2025-12-04T13:38:32.0535665Z [rank0]:E1204 13:07:34.371000 387390 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0535825Z [rank0]:E1204 13:07:34.371000 387390 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0536114Z [rank0]:E1204 13:07:34.371000 387390 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0536269Z [rank0]:E1204 13:07:34.371000 387390 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0536561Z [rank0]:E1204 13:07:34.371000 387390 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0536689Z [rank0]:E1204 13:07:34.371000 387390 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0536976Z [rank0]:E1204 13:07:34.371000 387390 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0537125Z [rank0]:E1204 13:07:34.371000 387390 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0537400Z [rank0]:E1204 13:07:34.371000 387390 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0537549Z [rank0]:E1204 13:07:34.371000 387390 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0537826Z [rank0]:E1204 13:07:34.371000 387390 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0537978Z [rank0]:E1204 13:07:34.371000 387390 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0538260Z [rank0]:E1204 13:07:34.371000 387390 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0538409Z [rank0]:E1204 13:07:34.371000 387390 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0538885Z [rank0]:E1204 13:07:34.371000 387390 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T13:38:32.0539013Z [rank0]:E1204 13:07:34.371000 387390 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0539209Z [rank0]:E1204 13:07:34.371000 387390 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0539566Z [rank0]:E1204 13:07:34.371000 387390 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda 2025-12-04T13:38:32.0539722Z [rank0]:E1204 13:07:34.371000 387390 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0539954Z [rank0]:E1204 13:07:34.371000 387390 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0540117Z [rank0]:E1204 13:07:34.371000 387390 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.0540157Z dist init r=0, world=4 2025-12-04T13:38:32.0540294Z [rank1]:E1204 13:07:34.377000 387391 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0540455Z [rank1]:E1204 13:07:34.377000 387391 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0540745Z [rank1]:E1204 13:07:34.377000 387391 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0540901Z [rank1]:E1204 13:07:34.377000 387391 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0541186Z [rank1]:E1204 
13:07:34.377000 387391 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0541322Z [rank1]:E1204 13:07:34.377000 387391 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0541600Z [rank1]:E1204 13:07:34.377000 387391 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0541747Z [rank1]:E1204 13:07:34.377000 387391 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0542025Z [rank1]:E1204 13:07:34.377000 387391 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0542185Z [rank1]:E1204 13:07:34.377000 387391 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0542462Z [rank1]:E1204 13:07:34.377000 387391 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0542599Z [rank1]:E1204 13:07:34.377000 387391 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0542875Z [rank1]:E1204 13:07:34.377000 387391 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0543039Z [rank1]:E1204 13:07:34.377000 387391 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0543516Z [rank1]:E1204 13:07:34.377000 387391 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 
2025-12-04T13:38:32.0543631Z [rank1]:E1204 13:07:34.377000 387391 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0543827Z [rank1]:E1204 13:07:34.377000 387391 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0544184Z [rank1]:E1204 13:07:34.377000 387391 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda 2025-12-04T13:38:32.0544309Z [rank1]:E1204 13:07:34.377000 387391 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0544520Z [rank1]:E1204 13:07:34.377000 387391 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0544685Z [rank1]:E1204 13:07:34.377000 387391 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.0544722Z dist init r=1, world=4 2025-12-04T13:38:32.0544861Z [rank3]:E1204 13:07:34.380000 387393 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0545021Z [rank3]:E1204 13:07:34.380000 387393 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0545310Z [rank3]:E1204 13:07:34.380000 387393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0545476Z [rank3]:E1204 13:07:34.380000 387393 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0545759Z [rank3]:E1204 13:07:34.380000 387393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0545883Z [rank3]:E1204 13:07:34.380000 387393 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0546162Z [rank3]:E1204 13:07:34.380000 387393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0546312Z [rank3]:E1204 13:07:34.380000 387393 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0546600Z [rank3]:E1204 13:07:34.380000 387393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0546749Z [rank3]:E1204 13:07:34.380000 387393 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0547025Z [rank3]:E1204 13:07:34.380000 387393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0547171Z [rank3]:E1204 13:07:34.380000 387393 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.0547452Z [rank3]:E1204 13:07:34.380000 387393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0547601Z [rank3]:E1204 13:07:34.380000 387393 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0548078Z [rank3]:E1204 13:07:34.380000 387393 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T13:38:32.0548190Z [rank3]:E1204 13:07:34.380000 387393 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0548399Z [rank3]:E1204 13:07:34.380000 387393 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0548760Z [rank3]:E1204 13:07:34.380000 387393 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda 2025-12-04T13:38:32.0548871Z [rank3]:E1204 13:07:34.380000 387393 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0549084Z [rank3]:E1204 13:07:34.380000 387393 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0549247Z [rank3]:E1204 13:07:34.380000 387393 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.0549288Z dist init r=3, world=4 2025-12-04T13:38:32.0549663Z [rank0]:[W1204 13:07:34.621121381 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.0549705Z FAILED [57.6789s] [100%] 2025-12-04T13:38:32.0549707Z 2025-12-04T13:38:32.0549777Z =================================== FAILURES =================================== 2025-12-04T13:38:32.0549879Z __ TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda ___ 2025-12-04T13:38:32.0549925Z Traceback (most recent call last): 2025-12-04T13:38:32.0550090Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.0550135Z self._join_processes(fn) 2025-12-04T13:38:32.0550308Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.0550365Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.0550558Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.0550603Z raise RuntimeError(error) 2025-12-04T13:38:32.0550684Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.0550731Z Traceback (most recent call last): 2025-12-04T13:38:32.0550892Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0550936Z getattr(self, test_name)() 2025-12-04T13:38:32.0551093Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0551143Z fn() 2025-12-04T13:38:32.0551295Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0551338Z method(*args, **kwargs) 2025-12-04T13:38:32.0551490Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0551533Z method(*args, **kwargs) 2025-12-04T13:38:32.0551686Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0551724Z with policy(): 2025-12-04T13:38:32.0551877Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0551920Z raise RuntimeError(msg) 2025-12-04T13:38:32.0552275Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 
2025-12-04T13:38:32.0552293Z 2025-12-04T13:38:32.0552370Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0552605Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda 2025-12-04T13:38:32.0552607Z 2025-12-04T13:38:32.0552695Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0552697Z 2025-12-04T13:38:32.0552699Z 2025-12-04T13:38:32.0552777Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.0552865Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.0553102Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4c3c53bf15bf011b.xml - 2025-12-04T13:38:32.0553163Z =========================== short test summary info ============================ 2025-12-04T13:38:32.0553414Z FAILED [57.6789s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.0553460Z Traceback (most recent call last): 2025-12-04T13:38:32.0553642Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0553684Z getattr(self, test_name)() 2025-12-04T13:38:32.0553845Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0553882Z fn() 2025-12-04T13:38:32.0554033Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0554076Z method(*args, **kwargs) 2025-12-04T13:38:32.0554227Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0554268Z method(*args, **kwargs) 2025-12-04T13:38:32.0554431Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0554471Z with policy(): 2025-12-04T13:38:32.0554625Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0554667Z raise RuntimeError(msg) 2025-12-04T13:38:32.0555021Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T13:38:32.0555034Z 2025-12-04T13:38:32.0555111Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0555342Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda 2025-12-04T13:38:32.0555344Z 2025-12-04T13:38:32.0555436Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0555499Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:38:32.0555563Z ====================== 1 failed, 32 deselected in 57.84s ======================= 2025-12-04T13:38:32.0555602Z Got exit code 1 2025-12-04T13:38:32.0555642Z Retrying single test... 2025-12-04T13:38:32.0555833Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-e63260897d77ebfd.xml 2025-12-04T13:38:32.0555892Z ============================= test session starts ============================== 2025-12-04T13:38:32.0556022Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.0556063Z cachedir: .pytest_cache 2025-12-04T13:38:32.0556222Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.0556269Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.0556311Z configfile: pytest.ini 2025-12-04T13:38:32.0556477Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.0556552Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.0556776Z stepcurrent: skipping 8 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_none_cuda 2025-12-04T13:38:32.0556820Z Running 1 items in this shard 2025-12-04T13:38:32.0556822Z 2025-12-04T13:38:32.0557131Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_none_cuda I1204 13:07:38.814000 387723 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 387792 2025-12-04T13:38:32.0557289Z I1204 13:07:38.815000 387723 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 387793 2025-12-04T13:38:32.0557451Z I1204 13:07:38.815000 387723 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 387794 2025-12-04T13:38:32.0557606Z I1204 13:07:38.816000 387723 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 387795 2025-12-04T13:38:32.0558188Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0558226Z _warn_cpu_init() 2025-12-04T13:38:32.0558808Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.0558845Z _warn_cpu_init() 2025-12-04T13:38:32.0559412Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0559463Z _warn_cpu_init() 2025-12-04T13:38:32.0560065Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0560104Z _warn_cpu_init() 2025-12-04T13:38:32.0560399Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.0560460Z return func(*args, **kwargs) 2025-12-04T13:38:32.0560602Z [rank1]:E1204 13:08:34.714000 387793 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0560767Z [rank1]:E1204 13:08:34.714000 387793 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0561057Z [rank1]:E1204 13:08:34.714000 387793 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0561211Z [rank1]:E1204 13:08:34.714000 387793 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0561497Z [rank1]:E1204 13:08:34.714000 387793 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0561623Z [rank1]:E1204 13:08:34.714000 387793 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0561916Z [rank1]:E1204 13:08:34.714000 387793 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0562065Z [rank1]:E1204 13:08:34.714000 387793 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0562342Z [rank1]:E1204 13:08:34.714000 387793 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0562494Z [rank1]:E1204 13:08:34.714000 387793 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0562771Z [rank1]:E1204 13:08:34.714000 387793 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0562924Z [rank1]:E1204 13:08:34.714000 387793 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0563201Z [rank1]:E1204 13:08:34.714000 387793 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0563351Z [rank1]:E1204 13:08:34.714000 387793 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0563829Z [rank1]:E1204 13:08:34.714000 387793 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 2025-12-04T13:38:32.0563962Z [rank1]:E1204 13:08:34.714000 387793 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0564159Z [rank1]:E1204 13:08:34.714000 387793 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0564517Z [rank1]:E1204 13:08:34.714000 387793 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda 2025-12-04T13:38:32.0564633Z [rank1]:E1204 13:08:34.714000 387793 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0564856Z [rank1]:E1204 13:08:34.714000 387793 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0565023Z [rank1]:E1204 13:08:34.714000 387793 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.0565062Z dist init r=1, world=4 2025-12-04T13:38:32.0565413Z [rank2]:E1204 13:08:34.730000 387794 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0565572Z [rank2]:E1204 13:08:34.730000 387794 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0565860Z [rank2]:E1204 13:08:34.730000 387794 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0566017Z [rank2]:E1204 13:08:34.730000 387794 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0566301Z [rank2]:E1204 13:08:34.730000 387794 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0566437Z [rank2]:E1204 13:08:34.730000 387794 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0566713Z [rank2]:E1204 13:08:34.730000 387794 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0566861Z [rank2]:E1204 13:08:34.730000 387794 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0567138Z [rank2]:E1204 13:08:34.730000 387794 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0567299Z [rank2]:E1204 13:08:34.730000 387794 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0567579Z [rank2]:E1204 13:08:34.730000 387794 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0567714Z [rank2]:E1204 13:08:34.730000 387794 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0567992Z [rank2]:E1204 13:08:34.730000 387794 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0568152Z [rank2]:E1204 13:08:34.730000 387794 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0568631Z [rank2]:E1204 13:08:34.730000 387794 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 
2025-12-04T13:38:32.0568748Z [rank2]:E1204 13:08:34.730000 387794 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0568942Z [rank2]:E1204 13:08:34.730000 387794 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0569300Z [rank2]:E1204 13:08:34.730000 387794 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda 2025-12-04T13:38:32.0569431Z [rank2]:E1204 13:08:34.730000 387794 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0569685Z [rank2]:E1204 13:08:34.730000 387794 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0569851Z [rank2]:E1204 13:08:34.730000 387794 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.0569892Z dist init r=2, world=4 2025-12-04T13:38:32.0570028Z [rank0]:E1204 13:08:34.762000 387792 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0570192Z [rank0]:E1204 13:08:34.762000 387792 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0570481Z [rank0]:E1204 13:08:34.762000 387792 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0570652Z [rank0]:E1204 13:08:34.762000 387792 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0570939Z [rank0]:E1204 13:08:34.762000 387792 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0571062Z [rank0]:E1204 13:08:34.762000 387792 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0571341Z [rank0]:E1204 13:08:34.762000 387792 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0571489Z [rank0]:E1204 13:08:34.762000 387792 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0571781Z [rank0]:E1204 13:08:34.762000 387792 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0571927Z [rank0]:E1204 13:08:34.762000 387792 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0572207Z [rank0]:E1204 13:08:34.762000 387792 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0572356Z [rank0]:E1204 13:08:34.762000 387792 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.0572634Z [rank0]:E1204 13:08:34.762000 387792 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0572784Z [rank0]:E1204 13:08:34.762000 387792 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0573262Z [rank0]:E1204 13:08:34.762000 387792 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T13:38:32.0573379Z [rank0]:E1204 13:08:34.762000 387792 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0573590Z [rank0]:E1204 13:08:34.762000 387792 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0573947Z [rank0]:E1204 13:08:34.762000 387792 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda 2025-12-04T13:38:32.0574062Z [rank0]:E1204 13:08:34.762000 387792 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0574273Z [rank0]:E1204 13:08:34.762000 387792 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0574439Z [rank0]:E1204 13:08:34.762000 387792 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.0574478Z dist init r=0, world=4 2025-12-04T13:38:32.0574617Z [rank3]:E1204 13:08:34.783000 387795 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0574777Z [rank3]:E1204 13:08:34.783000 387795 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0575076Z [rank3]:E1204 13:08:34.783000 387795 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0575231Z [rank3]:E1204 13:08:34.783000 387795 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0575516Z [rank3]:E1204 13:08:34.783000 387795 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0575644Z [rank3]:E1204 13:08:34.783000 387795 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0575933Z [rank3]:E1204 13:08:34.783000 387795 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0576084Z [rank3]:E1204 13:08:34.783000 387795 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:38:32.0576359Z [rank3]:E1204 13:08:34.783000 387795 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0576509Z [rank3]:E1204 13:08:34.783000 387795 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0576796Z [rank3]:E1204 13:08:34.783000 387795 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0576935Z [rank3]:E1204 13:08:34.783000 387795 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0577214Z [rank3]:E1204 13:08:34.783000 387795 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0577361Z [rank3]:E1204 13:08:34.783000 387795 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0577841Z [rank3]:E1204 13:08:34.783000 387795 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T13:38:32.0577970Z [rank3]:E1204 13:08:34.783000 387795 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0578168Z [rank3]:E1204 13:08:34.783000 387795 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0578526Z [rank3]:E1204 13:08:34.783000 387795 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda 2025-12-04T13:38:32.0578638Z [rank3]:E1204 13:08:34.783000 387795 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0578851Z [rank3]:E1204 13:08:34.783000 387795 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0579016Z [rank3]:E1204 13:08:34.783000 387795 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.0579056Z dist init r=3, world=4 2025-12-04T13:38:32.0579401Z [rank0]:[W1204 13:08:35.002714555 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.0579444Z FAILED [57.7867s] [100%] 2025-12-04T13:38:32.0579446Z 2025-12-04T13:38:32.0579501Z =================================== FAILURES =================================== 2025-12-04T13:38:32.0579648Z __ TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda ___ 2025-12-04T13:38:32.0579696Z Traceback (most recent call last): 2025-12-04T13:38:32.0579863Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.0579905Z self._join_processes(fn) 2025-12-04T13:38:32.0580093Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.0580150Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.0580327Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.0580372Z raise RuntimeError(error) 2025-12-04T13:38:32.0580451Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.0580498Z Traceback (most recent call last): 2025-12-04T13:38:32.0580659Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0580716Z getattr(self, test_name)() 2025-12-04T13:38:32.0580874Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0580910Z fn() 2025-12-04T13:38:32.0581063Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0581105Z method(*args, **kwargs) 2025-12-04T13:38:32.0581255Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0581297Z method(*args, **kwargs) 2025-12-04T13:38:32.0581448Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0581487Z with policy(): 2025-12-04T13:38:32.0581640Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0581697Z raise RuntimeError(msg) 2025-12-04T13:38:32.0582048Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 
2025-12-04T13:38:32.0582052Z 2025-12-04T13:38:32.0582127Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0582360Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda 2025-12-04T13:38:32.0582362Z 2025-12-04T13:38:32.0582448Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0582451Z 2025-12-04T13:38:32.0582452Z 2025-12-04T13:38:32.0582529Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.0582618Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.0582851Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-e63260897d77ebfd.xml - 2025-12-04T13:38:32.0582911Z =========================== short test summary info ============================ 2025-12-04T13:38:32.0583183Z FAILED [57.7867s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_none_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.0583228Z Traceback (most recent call last): 2025-12-04T13:38:32.0583392Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0583434Z getattr(self, test_name)() 2025-12-04T13:38:32.0583596Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0583633Z fn() 2025-12-04T13:38:32.0583786Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0583827Z method(*args, **kwargs) 2025-12-04T13:38:32.0583990Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0584032Z method(*args, **kwargs) 2025-12-04T13:38:32.0584184Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0584224Z with policy(): 2025-12-04T13:38:32.0584377Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0584420Z raise RuntimeError(msg) 2025-12-04T13:38:32.0584790Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 2025-12-04T13:38:32.0584794Z 2025-12-04T13:38:32.0584870Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0585101Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_none_cuda 2025-12-04T13:38:32.0585103Z 2025-12-04T13:38:32.0585190Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0585253Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
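[editor's note] The UserWarning repeated above on every rank recommends passing the `device_id` argument to FSDP so that sharding initialization runs on the GPU and `sync_module_states=True` can use GPU communication. A minimal sketch of that recommendation follows; the toy Linear module and rank handling are assumptions for illustration, not code from test_fsdp_core.py, and a process group is assumed to be initialized already.

    # Minimal sketch of the `device_id` recommendation from the FSDP warning
    # above. Toy model and rank handling are assumptions; assumes
    # init_process_group() has already been called on every rank.
    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_model(rank: int) -> FSDP:
        torch.cuda.set_device(rank)
        module = torch.nn.Linear(1024, 1024)  # constructed on CPU
        # device_id moves the module to this rank's GPU for sharding
        # initialization and makes sync_module_states=True usable.
        return FSDP(
            module,
            device_id=torch.cuda.current_device(),
            sync_module_states=True,
        )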
2025-12-04T13:38:32.0585317Z ====================== 1 failed, 32 deselected in 57.95s ======================= 2025-12-04T13:38:32.0585357Z Got exit code 1 2025-12-04T13:38:32.0585546Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_none_cuda 2025-12-04T13:38:32.0585675Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:32.0585866Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-1e18d1fda7ad478d.xml 2025-12-04T13:38:32.0585925Z ============================= test session starts ============================== 2025-12-04T13:38:32.0586039Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.0586082Z cachedir: .pytest_cache 2025-12-04T13:38:32.0586238Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.0586286Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.0586327Z configfile: pytest.ini 2025-12-04T13:38:32.0586493Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.0586566Z collecting ... collected 60 items / 9 deselected / 51 selected 2025-12-04T13:38:32.0586624Z stepcurrent: skipping 9 already run items. 2025-12-04T13:38:32.0586667Z Running 24 items in this shard 2025-12-04T13:38:32.0586669Z 2025-12-04T13:38:32.0587000Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda I1204 13:08:39.178000 388125 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 388194 2025-12-04T13:38:32.0587156Z I1204 13:08:39.179000 388125 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 388195 2025-12-04T13:38:32.0587310Z I1204 13:08:39.179000 388125 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 388196 2025-12-04T13:38:32.0587466Z I1204 13:08:39.180000 388125 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 388197 2025-12-04T13:38:32.0588055Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0588095Z _warn_cpu_init() 2025-12-04T13:38:32.0588664Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.0588715Z _warn_cpu_init() 2025-12-04T13:38:32.0589286Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0589323Z _warn_cpu_init() 2025-12-04T13:38:32.0589928Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0589982Z _warn_cpu_init() 2025-12-04T13:38:32.0590280Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.0590322Z return func(*args, **kwargs) 2025-12-04T13:38:32.0590467Z [rank0]:E1204 13:09:35.016000 388194 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0590630Z [rank0]:E1204 13:09:35.016000 388194 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0590917Z [rank0]:E1204 13:09:35.016000 388194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0591075Z [rank0]:E1204 13:09:35.016000 388194 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0591375Z [rank0]:E1204 13:09:35.016000 388194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0591503Z [rank0]:E1204 13:09:35.016000 388194 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0591786Z [rank0]:E1204 13:09:35.016000 388194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0591938Z [rank0]:E1204 13:09:35.016000 388194 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0592233Z [rank0]:E1204 13:09:35.016000 388194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0592380Z [rank0]:E1204 13:09:35.016000 388194 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0592656Z [rank0]:E1204 13:09:35.016000 388194 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0592791Z [rank0]:E1204 13:09:35.016000 388194 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0593085Z [rank0]:E1204 13:09:35.016000 388194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0593234Z [rank0]:E1204 13:09:35.016000 388194 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0593727Z [rank0]:E1204 13:09:35.016000 388194 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T13:38:32.0593844Z [rank0]:E1204 13:09:35.016000 388194 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0594040Z [rank0]:E1204 13:09:35.016000 388194 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0594432Z [rank0]:E1204 13:09:35.016000 388194 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.0594548Z [rank0]:E1204 13:09:35.016000 388194 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0594762Z [rank0]:E1204 13:09:35.016000 388194 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0594926Z [rank0]:E1204 13:09:35.016000 388194 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.0594967Z dist init r=0, world=4 2025-12-04T13:38:32.0595104Z [rank2]:E1204 13:09:35.031000 388196 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0595266Z [rank2]:E1204 13:09:35.031000 388196 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0595565Z [rank2]:E1204 13:09:35.031000 388196 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0595720Z [rank2]:E1204 13:09:35.031000 388196 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0596010Z [rank2]:E1204 13:09:35.031000 388196 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0596134Z [rank2]:E1204 13:09:35.031000 388196 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0596416Z [rank2]:E1204 13:09:35.031000 388196 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0596575Z [rank2]:E1204 13:09:35.031000 388196 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0596855Z [rank2]:E1204 13:09:35.031000 388196 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0597004Z [rank2]:E1204 13:09:35.031000 388196 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0597279Z [rank2]:E1204 13:09:35.031000 388196 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0597428Z [rank2]:E1204 13:09:35.031000 388196 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0597706Z [rank2]:E1204 13:09:35.031000 388196 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0597857Z [rank2]:E1204 13:09:35.031000 388196 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0598345Z [rank2]:E1204 13:09:35.031000 388196 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 
2025-12-04T13:38:32.0598476Z [rank2]:E1204 13:09:35.031000 388196 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0598674Z [rank2]:E1204 13:09:35.031000 388196 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0599045Z [rank2]:E1204 13:09:35.031000 388196 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.0599159Z [rank2]:E1204 13:09:35.031000 388196 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0599370Z [rank2]:E1204 13:09:35.031000 388196 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0599536Z [rank2]:E1204 13:09:35.031000 388196 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.0599612Z dist init r=2, world=4 2025-12-04T13:38:32.0599755Z [rank3]:E1204 13:09:35.078000 388197 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0599929Z [rank3]:E1204 13:09:35.078000 388197 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0600216Z [rank3]:E1204 13:09:35.078000 388197 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0600371Z [rank3]:E1204 13:09:35.078000 388197 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0600656Z [rank3]:E1204 13:09:35.078000 388197 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0600799Z [rank3]:E1204 13:09:35.078000 388197 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0601078Z [rank3]:E1204 13:09:35.078000 388197 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0601226Z [rank3]:E1204 13:09:35.078000 388197 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0601500Z [rank3]:E1204 13:09:35.078000 388197 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0601662Z [rank3]:E1204 13:09:35.078000 388197 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0601940Z [rank3]:E1204 13:09:35.078000 388197 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0602076Z [rank3]:E1204 13:09:35.078000 388197 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.0602353Z [rank3]:E1204 13:09:35.078000 388197 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0602501Z [rank3]:E1204 13:09:35.078000 388197 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0603005Z [rank3]:E1204 13:09:35.078000 388197 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T13:38:32.0603121Z [rank3]:E1204 13:09:35.078000 388197 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0603317Z [rank3]:E1204 13:09:35.078000 388197 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0603693Z [rank3]:E1204 13:09:35.078000 388197 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.0603807Z [rank3]:E1204 13:09:35.078000 388197 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0604021Z [rank3]:E1204 13:09:35.078000 388197 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0604196Z [rank3]:E1204 13:09:35.078000 388197 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.0604237Z dist init r=3, world=4 2025-12-04T13:38:32.0604373Z [rank1]:E1204 13:09:35.083000 388195 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0604534Z [rank1]:E1204 13:09:35.083000 388195 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0604819Z [rank1]:E1204 13:09:35.083000 388195 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0604975Z [rank1]:E1204 13:09:35.083000 388195 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0605274Z [rank1]:E1204 13:09:35.083000 388195 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0605397Z [rank1]:E1204 13:09:35.083000 388195 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0605676Z [rank1]:E1204 13:09:35.083000 388195 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0605836Z [rank1]:E1204 13:09:35.083000 388195 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0606117Z [rank1]:E1204 13:09:35.083000 388195 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0606264Z [rank1]:E1204 13:09:35.083000 388195 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0606540Z [rank1]:E1204 13:09:35.083000 388195 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0606677Z [rank1]:E1204 13:09:35.083000 388195 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0606952Z [rank1]:E1204 13:09:35.083000 388195 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0607114Z [rank1]:E1204 13:09:35.083000 388195 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0607603Z [rank1]:E1204 13:09:35.083000 388195 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 2025-12-04T13:38:32.0607720Z [rank1]:E1204 13:09:35.083000 388195 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0607918Z [rank1]:E1204 13:09:35.083000 388195 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0608294Z [rank1]:E1204 13:09:35.083000 388195 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.0608420Z [rank1]:E1204 13:09:35.083000 388195 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0608632Z [rank1]:E1204 13:09:35.083000 388195 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0608797Z [rank1]:E1204 13:09:35.083000 388195 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.0608834Z dist init r=1, world=4 2025-12-04T13:38:32.0609176Z [rank0]:[W1204 13:09:35.206508919 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.0609217Z FAILED [57.7858s] [ 4%] 2025-12-04T13:38:32.0609219Z 2025-12-04T13:38:32.0609288Z =================================== FAILURES =================================== 2025-12-04T13:38:32.0609399Z _ TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda _ 2025-12-04T13:38:32.0609448Z Traceback (most recent call last): 2025-12-04T13:38:32.0609649Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.0609693Z self._join_processes(fn) 2025-12-04T13:38:32.0609868Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.0609936Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.0610117Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.0610160Z raise RuntimeError(error) 2025-12-04T13:38:32.0610242Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.0610287Z Traceback (most recent call last): 2025-12-04T13:38:32.0610450Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0610491Z getattr(self, test_name)() 2025-12-04T13:38:32.0610651Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0610685Z fn() 2025-12-04T13:38:32.0610838Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0610879Z method(*args, **kwargs) 2025-12-04T13:38:32.0611046Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0611085Z method(*args, **kwargs) 2025-12-04T13:38:32.0611238Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0611275Z with policy(): 2025-12-04T13:38:32.0611431Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0611471Z raise RuntimeError(msg) 2025-12-04T13:38:32.0611837Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 
2025-12-04T13:38:32.0611840Z 2025-12-04T13:38:32.0611918Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0612162Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.0612164Z 2025-12-04T13:38:32.0612254Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0612257Z 2025-12-04T13:38:32.0612271Z 2025-12-04T13:38:32.0612346Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.0612435Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.0612672Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-1e18d1fda7ad478d.xml - 2025-12-04T13:38:32.0612733Z =========================== short test summary info ============================ 2025-12-04T13:38:32.0612993Z FAILED [57.7858s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.0613041Z Traceback (most recent call last): 2025-12-04T13:38:32.0613220Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0613263Z getattr(self, test_name)() 2025-12-04T13:38:32.0613425Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0613459Z fn() 2025-12-04T13:38:32.0613612Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0613651Z method(*args, **kwargs) 2025-12-04T13:38:32.0613804Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0614928Z method(*args, **kwargs) 2025-12-04T13:38:32.0615080Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0615117Z with policy(): 2025-12-04T13:38:32.0615272Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0615313Z raise RuntimeError(msg) 2025-12-04T13:38:32.0615678Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T13:38:32.0615681Z 2025-12-04T13:38:32.0615755Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0616003Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.0616020Z 2025-12-04T13:38:32.0616109Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0616173Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
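[editor's note] Two further warnings in the output above are actionable: barrier() suggests specifying `device_id` in `init_process_group`, and ProcessGroupNCCL warns that `destroy_process_group()` was not called before program exit. The sketch below shows one per-rank setup/teardown pattern that addresses both; the environment-variable handling assumes a torchrun-style launcher and is not taken from the failing test's harness.

    # Minimal sketch addressing the barrier() device warning and the
    # destroy_process_group() warning above. Assumes a torchrun-style
    # launcher sets LOCAL_RANK, WORLD_SIZE, MASTER_ADDR and MASTER_PORT.
    import os
    import torch
    import torch.distributed as dist

    def main():
        rank = int(os.environ["LOCAL_RANK"])
        world_size = int(os.environ["WORLD_SIZE"])
        torch.cuda.set_device(rank)
        # Binding the process group to a device up front silences the
        # "barrier(): using the device under current context" warning.
        dist.init_process_group(
            "nccl",
            rank=rank,
            world_size=world_size,
            device_id=torch.device("cuda", rank),
        )
        try:
            dist.barrier()
            # ... per-rank work ...
        finally:
            # Explicit teardown avoids the ProcessGroupNCCL warning about
            # destroy_process_group() not being called before exit.
            dist.destroy_process_group()

    if __name__ == "__main__":
        main()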
2025-12-04T13:38:32.0616238Z ======================= 1 failed, 9 deselected in 57.93s ======================= 2025-12-04T13:38:32.0616275Z Got exit code 1 2025-12-04T13:38:32.0616317Z Retrying single test... 2025-12-04T13:38:32.0616507Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-74a9a6fcf6eb4745.xml 2025-12-04T13:38:32.0616567Z ============================= test session starts ============================== 2025-12-04T13:38:32.0616679Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.0616724Z cachedir: .pytest_cache 2025-12-04T13:38:32.0616882Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.0616931Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.0616971Z configfile: pytest.ini 2025-12-04T13:38:32.0617156Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.0617233Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.0617470Z stepcurrent: skipping 9 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.0617513Z Running 1 items in this shard 2025-12-04T13:38:32.0617515Z 2025-12-04T13:38:32.0617838Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda I1204 13:09:39.608000 388527 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 388596 2025-12-04T13:38:32.0617998Z I1204 13:09:39.608000 388527 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 388597 2025-12-04T13:38:32.0618161Z I1204 13:09:39.609000 388527 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 388598 2025-12-04T13:38:32.0618317Z I1204 13:09:39.609000 388527 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 388599 2025-12-04T13:38:32.0618898Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0618950Z _warn_cpu_init() 2025-12-04T13:38:32.0619517Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.0619557Z _warn_cpu_init() 2025-12-04T13:38:32.0620170Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0620223Z _warn_cpu_init() 2025-12-04T13:38:32.0620794Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0620831Z _warn_cpu_init() 2025-12-04T13:38:32.0621124Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.0621170Z return func(*args, **kwargs) 2025-12-04T13:38:32.0621314Z [rank3]:E1204 13:10:36.113000 388599 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0621479Z [rank3]:E1204 13:10:36.113000 388599 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0621782Z [rank3]:E1204 13:10:36.113000 388599 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0621939Z [rank3]:E1204 13:10:36.113000 388599 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0622224Z [rank3]:E1204 13:10:36.113000 388599 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0622352Z [rank3]:E1204 13:10:36.113000 388599 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0622643Z [rank3]:E1204 13:10:36.113000 388599 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0622798Z [rank3]:E1204 13:10:36.113000 388599 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0623076Z [rank3]:E1204 13:10:36.113000 388599 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0623222Z [rank3]:E1204 13:10:36.113000 388599 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0623518Z [rank3]:E1204 13:10:36.113000 388599 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0623655Z [rank3]:E1204 13:10:36.113000 388599 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0623939Z [rank3]:E1204 13:10:36.113000 388599 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0624086Z [rank3]:E1204 13:10:36.113000 388599 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0624581Z [rank3]:E1204 13:10:36.113000 388599 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T13:38:32.0624709Z [rank3]:E1204 13:10:36.113000 388599 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0624905Z [rank3]:E1204 13:10:36.113000 388599 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0625280Z [rank3]:E1204 13:10:36.113000 388599 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.0625393Z [rank3]:E1204 13:10:36.113000 388599 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0625606Z [rank3]:E1204 13:10:36.113000 388599 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0625771Z [rank3]:E1204 13:10:36.113000 388599 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.0625812Z dist init r=3, world=4 2025-12-04T13:38:32.0625961Z [rank2]:E1204 13:10:36.143000 388598 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0626119Z [rank2]:E1204 13:10:36.143000 388598 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0626410Z [rank2]:E1204 13:10:36.143000 388598 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0626565Z [rank2]:E1204 13:10:36.143000 388598 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0626864Z [rank2]:E1204 13:10:36.143000 388598 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0626987Z [rank2]:E1204 13:10:36.143000 388598 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0627265Z [rank2]:E1204 13:10:36.143000 388598 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0627413Z [rank2]:E1204 13:10:36.143000 388598 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0627692Z [rank2]:E1204 13:10:36.143000 388598 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0627853Z [rank2]:E1204 13:10:36.143000 388598 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0628129Z [rank2]:E1204 13:10:36.143000 388598 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0628267Z [rank2]:E1204 13:10:36.143000 388598 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0628546Z [rank2]:E1204 13:10:36.143000 388598 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0628697Z [rank2]:E1204 13:10:36.143000 388598 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0629195Z [rank2]:E1204 13:10:36.143000 388598 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 
2025-12-04T13:38:32.0629311Z [rank2]:E1204 13:10:36.143000 388598 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0629507Z [rank2]:E1204 13:10:36.143000 388598 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0629919Z [rank2]:E1204 13:10:36.143000 388598 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.0630035Z [rank2]:E1204 13:10:36.143000 388598 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0630261Z [rank2]:E1204 13:10:36.143000 388598 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0630426Z [rank2]:E1204 13:10:36.143000 388598 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.0630465Z dist init r=2, world=4 2025-12-04T13:38:32.0630605Z [rank0]:E1204 13:10:36.171000 388596 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0630766Z [rank0]:E1204 13:10:36.171000 388596 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0631054Z [rank0]:E1204 13:10:36.171000 388596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0631222Z [rank0]:E1204 13:10:36.171000 388596 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0631506Z [rank0]:E1204 13:10:36.171000 388596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0631632Z [rank0]:E1204 13:10:36.171000 388596 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0631911Z [rank0]:E1204 13:10:36.171000 388596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0632076Z [rank0]:E1204 13:10:36.171000 388596 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0632355Z [rank0]:E1204 13:10:36.171000 388596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0632501Z [rank0]:E1204 13:10:36.171000 388596 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0637920Z [rank0]:E1204 13:10:36.171000 388596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0638073Z [rank0]:E1204 13:10:36.171000 388596 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.0638406Z [rank0]:E1204 13:10:36.171000 388596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0638555Z [rank0]:E1204 13:10:36.171000 388596 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0639050Z [rank0]:E1204 13:10:36.171000 388596 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T13:38:32.0639165Z [rank0]:E1204 13:10:36.171000 388596 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0639367Z [rank0]:E1204 13:10:36.171000 388596 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0639810Z [rank0]:E1204 13:10:36.171000 388596 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.0639925Z [rank0]:E1204 13:10:36.171000 388596 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0640141Z [rank0]:E1204 13:10:36.171000 388596 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0640304Z [rank0]:E1204 13:10:36.171000 388596 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.0640348Z dist init r=0, world=4 2025-12-04T13:38:32.0640486Z [rank1]:E1204 13:10:36.197000 388597 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0640666Z [rank1]:E1204 13:10:36.197000 388597 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0640953Z [rank1]:E1204 13:10:36.197000 388597 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0641113Z [rank1]:E1204 13:10:36.197000 388597 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0641402Z [rank1]:E1204 13:10:36.197000 388597 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0641543Z [rank1]:E1204 13:10:36.197000 388597 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0641827Z [rank1]:E1204 13:10:36.197000 388597 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0641976Z [rank1]:E1204 13:10:36.197000 388597 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0642257Z [rank1]:E1204 13:10:36.197000 388597 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0642404Z [rank1]:E1204 13:10:36.197000 388597 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0642685Z [rank1]:E1204 13:10:36.197000 388597 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0642840Z [rank1]:E1204 13:10:36.197000 388597 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0643120Z [rank1]:E1204 13:10:36.197000 388597 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0643270Z [rank1]:E1204 13:10:36.197000 388597 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0643758Z [rank1]:E1204 13:10:36.197000 388597 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 2025-12-04T13:38:32.0643877Z [rank1]:E1204 13:10:36.197000 388597 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0644086Z [rank1]:E1204 13:10:36.197000 388597 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0644459Z [rank1]:E1204 13:10:36.197000 388597 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.0644576Z [rank1]:E1204 13:10:36.197000 388597 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0644788Z [rank1]:E1204 13:10:36.197000 388597 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0644958Z [rank1]:E1204 13:10:36.197000 388597 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.0645007Z dist init r=1, world=4 2025-12-04T13:38:32.0645350Z [rank0]:[W1204 13:10:36.425966146 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.0645393Z FAILED [58.3860s] [100%] 2025-12-04T13:38:32.0645395Z 2025-12-04T13:38:32.0645455Z =================================== FAILURES =================================== 2025-12-04T13:38:32.0645565Z _ TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda _ 2025-12-04T13:38:32.0645627Z Traceback (most recent call last): 2025-12-04T13:38:32.0645797Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.0645841Z self._join_processes(fn) 2025-12-04T13:38:32.0646019Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.0646075Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.0646256Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.0646299Z raise RuntimeError(error) 2025-12-04T13:38:32.0646381Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.0646426Z Traceback (most recent call last): 2025-12-04T13:38:32.0646592Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0646646Z getattr(self, test_name)() 2025-12-04T13:38:32.0646808Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0646843Z fn() 2025-12-04T13:38:32.0646999Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0647041Z method(*args, **kwargs) 2025-12-04T13:38:32.0647194Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0647233Z method(*args, **kwargs) 2025-12-04T13:38:32.0647386Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0647424Z with policy(): 2025-12-04T13:38:32.0647580Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0647622Z raise RuntimeError(msg) 2025-12-04T13:38:32.0647994Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 
2025-12-04T13:38:32.0647996Z 2025-12-04T13:38:32.0648086Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0648333Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.0648336Z 2025-12-04T13:38:32.0648428Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0648430Z 2025-12-04T13:38:32.0648431Z 2025-12-04T13:38:32.0648509Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.0648601Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.0648841Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-74a9a6fcf6eb4745.xml - 2025-12-04T13:38:32.0648915Z =========================== short test summary info ============================ 2025-12-04T13:38:32.0649184Z FAILED [58.3860s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.0649233Z Traceback (most recent call last): 2025-12-04T13:38:32.0649400Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0649444Z getattr(self, test_name)() 2025-12-04T13:38:32.0649644Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0649682Z fn() 2025-12-04T13:38:32.0649838Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0649879Z method(*args, **kwargs) 2025-12-04T13:38:32.0650034Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0650075Z method(*args, **kwargs) 2025-12-04T13:38:32.0650227Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0650264Z with policy(): 2025-12-04T13:38:32.0650421Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0650463Z raise RuntimeError(msg) 2025-12-04T13:38:32.0650829Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T13:38:32.0650848Z 2025-12-04T13:38:32.0650925Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0651173Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.0651175Z 2025-12-04T13:38:32.0651263Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0651327Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:38:32.0651395Z ====================== 1 failed, 32 deselected in 58.55s ======================= 2025-12-04T13:38:32.0651433Z Got exit code 1 2025-12-04T13:38:32.0651476Z Retrying single test... 2025-12-04T13:38:32.0651668Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-575314950bfef071.xml 2025-12-04T13:38:32.0651731Z ============================= test session starts ============================== 2025-12-04T13:38:32.0651847Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.0651903Z cachedir: .pytest_cache 2025-12-04T13:38:32.0652063Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.0652114Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.0652154Z configfile: pytest.ini 2025-12-04T13:38:32.0652323Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.0652398Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.0652640Z stepcurrent: skipping 9 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.0652683Z Running 1 items in this shard 2025-12-04T13:38:32.0652686Z 2025-12-04T13:38:32.0653023Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda I1204 13:10:40.594000 388929 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 388998 2025-12-04T13:38:32.0653181Z I1204 13:10:40.594000 388929 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 388999 2025-12-04T13:38:32.0653338Z I1204 13:10:40.595000 388929 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 389000 2025-12-04T13:38:32.0653493Z I1204 13:10:40.595000 388929 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 389001 2025-12-04T13:38:32.0654106Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0654146Z _warn_cpu_init() 2025-12-04T13:38:32.0654717Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.0654769Z _warn_cpu_init() 2025-12-04T13:38:32.0655342Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0655379Z _warn_cpu_init() 2025-12-04T13:38:32.0655946Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0655984Z _warn_cpu_init() 2025-12-04T13:38:32.0656297Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.0656342Z return func(*args, **kwargs) 2025-12-04T13:38:32.0656487Z [rank2]:E1204 13:11:36.224000 389000 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0656652Z [rank2]:E1204 13:11:36.224000 389000 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0656939Z [rank2]:E1204 13:11:36.224000 389000 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0657097Z [rank2]:E1204 13:11:36.224000 389000 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0657392Z [rank2]:E1204 13:11:36.224000 389000 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0657519Z [rank2]:E1204 13:11:36.224000 389000 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0657795Z [rank2]:E1204 13:11:36.224000 389000 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0657945Z [rank2]:E1204 13:11:36.224000 389000 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0658235Z [rank2]:E1204 13:11:36.224000 389000 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0658381Z [rank2]:E1204 13:11:36.224000 389000 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0658661Z [rank2]:E1204 13:11:36.224000 389000 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0658797Z [rank2]:E1204 13:11:36.224000 389000 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0659076Z [rank2]:E1204 13:11:36.224000 389000 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0659235Z [rank2]:E1204 13:11:36.224000 389000 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0659773Z [rank2]:E1204 13:11:36.224000 389000 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T13:38:32.0659893Z [rank2]:E1204 13:11:36.224000 389000 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0660087Z [rank2]:E1204 13:11:36.224000 389000 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0660460Z [rank2]:E1204 13:11:36.224000 389000 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.0660575Z [rank2]:E1204 13:11:36.224000 389000 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0660803Z [rank2]:E1204 13:11:36.224000 389000 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0660968Z [rank2]:E1204 13:11:36.224000 389000 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.0661012Z dist init r=2, world=4 2025-12-04T13:38:32.0661150Z [rank0]:E1204 13:11:36.267000 388998 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0661316Z [rank0]:E1204 13:11:36.267000 388998 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0661622Z [rank0]:E1204 13:11:36.267000 388998 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0661777Z [rank0]:E1204 13:11:36.267000 388998 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0662065Z [rank0]:E1204 13:11:36.267000 388998 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0662188Z [rank0]:E1204 13:11:36.267000 388998 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0662480Z [rank0]:E1204 13:11:36.267000 388998 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0662631Z [rank0]:E1204 13:11:36.267000 388998 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0662912Z [rank0]:E1204 13:11:36.267000 388998 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0663061Z [rank0]:E1204 13:11:36.267000 388998 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0663337Z [rank0]:E1204 13:11:36.267000 388998 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0663491Z [rank0]:E1204 13:11:36.267000 388998 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0663769Z [rank0]:E1204 13:11:36.267000 388998 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0663921Z [rank0]:E1204 13:11:36.267000 388998 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0664408Z [rank0]:E1204 13:11:36.267000 388998 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 
2025-12-04T13:38:32.0664526Z [rank0]:E1204 13:11:36.267000 388998 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0664724Z [rank0]:E1204 13:11:36.267000 388998 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0665103Z [rank0]:E1204 13:11:36.267000 388998 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.0665219Z [rank0]:E1204 13:11:36.267000 388998 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0665429Z [rank0]:E1204 13:11:36.267000 388998 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0665597Z [rank0]:E1204 13:11:36.267000 388998 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.0665638Z dist init r=0, world=4 2025-12-04T13:38:32.0665779Z [rank1]:E1204 13:11:36.269000 388999 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0665951Z [rank1]:E1204 13:11:36.269000 388999 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0666238Z [rank1]:E1204 13:11:36.269000 388999 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0666395Z [rank1]:E1204 13:11:36.269000 388999 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0666678Z [rank1]:E1204 13:11:36.269000 388999 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0666817Z [rank1]:E1204 13:11:36.269000 388999 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0667095Z [rank1]:E1204 13:11:36.269000 388999 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0667245Z [rank1]:E1204 13:11:36.269000 388999 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0667520Z [rank1]:E1204 13:11:36.269000 388999 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0667670Z [rank1]:E1204 13:11:36.269000 388999 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0667966Z [rank1]:E1204 13:11:36.269000 388999 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0668103Z [rank1]:E1204 13:11:36.269000 388999 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.0668380Z [rank1]:E1204 13:11:36.269000 388999 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0668528Z [rank1]:E1204 13:11:36.269000 388999 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0669015Z [rank1]:E1204 13:11:36.269000 388999 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 2025-12-04T13:38:32.0669135Z [rank1]:E1204 13:11:36.269000 388999 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0669342Z [rank1]:E1204 13:11:36.269000 388999 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0669770Z [rank1]:E1204 13:11:36.269000 388999 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.0669885Z [rank1]:E1204 13:11:36.269000 388999 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0670098Z [rank1]:E1204 13:11:36.269000 388999 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0670285Z [rank1]:E1204 13:11:36.269000 388999 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.0670327Z dist init r=1, world=4 2025-12-04T13:38:32.0670465Z [rank3]:E1204 13:11:36.276000 389001 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0670627Z [rank3]:E1204 13:11:36.276000 389001 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0670921Z [rank3]:E1204 13:11:36.276000 389001 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0671088Z [rank3]:E1204 13:11:36.276000 389001 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0671378Z [rank3]:E1204 13:11:36.276000 389001 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0671501Z [rank3]:E1204 13:11:36.276000 389001 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0671780Z [rank3]:E1204 13:11:36.276000 389001 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0671926Z [rank3]:E1204 13:11:36.276000 389001 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0672225Z [rank3]:E1204 13:11:36.276000 389001 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0672375Z [rank3]:E1204 13:11:36.276000 389001 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0672655Z [rank3]:E1204 13:11:36.276000 389001 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0672795Z [rank3]:E1204 13:11:36.276000 389001 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0673071Z [rank3]:E1204 13:11:36.276000 389001 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0673225Z [rank3]:E1204 13:11:36.276000 389001 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0673726Z [rank3]:E1204 13:11:36.276000 389001 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T13:38:32.0673844Z [rank3]:E1204 13:11:36.276000 389001 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0674041Z [rank3]:E1204 13:11:36.276000 389001 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0674411Z [rank3]:E1204 13:11:36.276000 389001 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.0674538Z [rank3]:E1204 13:11:36.276000 389001 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0674751Z [rank3]:E1204 13:11:36.276000 389001 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0674920Z [rank3]:E1204 13:11:36.276000 389001 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.0674959Z dist init r=3, world=4 2025-12-04T13:38:32.0675300Z [rank0]:[W1204 13:11:36.506596506 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.0675355Z FAILED [57.4837s] [100%] 2025-12-04T13:38:32.0675359Z 2025-12-04T13:38:32.0675416Z =================================== FAILURES =================================== 2025-12-04T13:38:32.0675529Z _ TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda _ 2025-12-04T13:38:32.0675575Z Traceback (most recent call last): 2025-12-04T13:38:32.0675741Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.0675785Z self._join_processes(fn) 2025-12-04T13:38:32.0675959Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.0676013Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.0676196Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.0676250Z raise RuntimeError(error) 2025-12-04T13:38:32.0676332Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.0676378Z Traceback (most recent call last): 2025-12-04T13:38:32.0676544Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0676586Z getattr(self, test_name)() 2025-12-04T13:38:32.0676749Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0676783Z fn() 2025-12-04T13:38:32.0676939Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0676979Z method(*args, **kwargs) 2025-12-04T13:38:32.0677135Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0677175Z method(*args, **kwargs) 2025-12-04T13:38:32.0677329Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0677367Z with policy(): 2025-12-04T13:38:32.0677539Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0677580Z raise RuntimeError(msg) 2025-12-04T13:38:32.0677947Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 
2025-12-04T13:38:32.0677950Z 2025-12-04T13:38:32.0678027Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0678275Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.0678278Z 2025-12-04T13:38:32.0678378Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0678380Z 2025-12-04T13:38:32.0678382Z 2025-12-04T13:38:32.0678459Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.0678549Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.0678782Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-575314950bfef071.xml - 2025-12-04T13:38:32.0678846Z =========================== short test summary info ============================ 2025-12-04T13:38:32.0679111Z FAILED [57.4837s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.0679171Z Traceback (most recent call last): 2025-12-04T13:38:32.0679339Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0679381Z getattr(self, test_name)() 2025-12-04T13:38:32.0679544Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0679626Z fn() 2025-12-04T13:38:32.0679781Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0679821Z method(*args, **kwargs) 2025-12-04T13:38:32.0679974Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0680015Z method(*args, **kwargs) 2025-12-04T13:38:32.0680184Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0680221Z with policy(): 2025-12-04T13:38:32.0680378Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0680421Z raise RuntimeError(msg) 2025-12-04T13:38:32.0680787Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T13:38:32.0680789Z 2025-12-04T13:38:32.0680865Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0681112Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.0681116Z 2025-12-04T13:38:32.0681207Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0681270Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
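The ProcessGroupNCCL warning earlier in this run ("destroy_process_group() was not called before program exit, which can leak resources") points at the standard explicit teardown call. A minimal sketch for a standalone reproduction, assuming the usual RANK/WORLD_SIZE/MASTER_ADDR environment set by the launcher; the workload body is a placeholder, not part of this test.

import torch.distributed as dist

# Assumes torchrun (or an equivalent launcher) provides RANK, WORLD_SIZE, MASTER_ADDR/PORT.
dist.init_process_group(backend="nccl", init_method="env://")
try:
    ...  # placeholder: run the distributed workload / test body here
finally:
    # Explicit teardown avoids the "destroy_process_group() was not called" warning.
    dist.destroy_process_group()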
2025-12-04T13:38:32.0681338Z ====================== 1 failed, 32 deselected in 57.65s ======================= 2025-12-04T13:38:32.0681391Z Got exit code 1 2025-12-04T13:38:32.0681587Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.0681716Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:32.0681908Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-49f8d9fac1f68b0e.xml 2025-12-04T13:38:32.0681967Z ============================= test session starts ============================== 2025-12-04T13:38:32.0682084Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.0682125Z cachedir: .pytest_cache 2025-12-04T13:38:32.0682299Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.0682346Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.0682391Z configfile: pytest.ini 2025-12-04T13:38:32.0682555Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.0682632Z collecting ... collected 60 items / 10 deselected / 50 selected 2025-12-04T13:38:32.0682685Z stepcurrent: skipping 10 already run items. 2025-12-04T13:38:32.0682733Z Running 23 items in this shard 2025-12-04T13:38:32.0682735Z 2025-12-04T13:38:32.0683053Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_no_shard_cuda I1204 13:11:40.736000 389331 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 389400 2025-12-04T13:38:32.0683223Z I1204 13:11:40.737000 389331 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 389401 2025-12-04T13:38:32.0683380Z I1204 13:11:40.737000 389331 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 389402 2025-12-04T13:38:32.0683531Z I1204 13:11:40.738000 389331 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 389403 2025-12-04T13:38:32.0683830Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0683881Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0684173Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0684236Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0684819Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.0684860Z _warn_cpu_init() 2025-12-04T13:38:32.0685433Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0685477Z _warn_cpu_init() 2025-12-04T13:38:32.0685782Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0685836Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0686402Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0686443Z _warn_cpu_init() 2025-12-04T13:38:32.0686744Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0686825Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0687113Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0687190Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0687478Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0687564Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0687860Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.0687905Z return func(*args, **kwargs) 2025-12-04T13:38:32.0688191Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0688242Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0688819Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0688872Z _warn_cpu_init() 2025-12-04T13:38:32.0689161Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0689236Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0689466Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0689511Z return func(*args, **kwargs) 2025-12-04T13:38:32.0689788Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0689829Z return func(*args, **kwargs) 2025-12-04T13:38:32.0690081Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0690121Z return func(*args, **kwargs) 2025-12-04T13:38:32.0690345Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0690385Z return func(*args, **kwargs) 2025-12-04T13:38:32.0690607Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0690648Z return func(*args, **kwargs) 2025-12-04T13:38:32.0690869Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0690909Z return func(*args, **kwargs) 2025-12-04T13:38:32.0691147Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0691188Z return func(*args, **kwargs) 2025-12-04T13:38:32.0691409Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned.
2025-12-04T13:38:32.0691450Z return func(*args, **kwargs) 2025-12-04T13:38:32.0691597Z [rank0]:E1204 13:11:48.547000 389400 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0691782Z [rank0]:E1204 13:11:48.547000 389400 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0692075Z [rank0]:E1204 13:11:48.547000 389400 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0692236Z [rank0]:E1204 13:11:48.547000 389400 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0692524Z [rank0]:E1204 13:11:48.547000 389400 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0692651Z [rank0]:E1204 13:11:48.547000 389400 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0692929Z [rank0]:E1204 13:11:48.547000 389400 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0693101Z [rank0]:E1204 13:11:48.547000 389400 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0693383Z [rank0]:E1204 13:11:48.547000 389400 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0693532Z [rank0]:E1204 13:11:48.547000 389400 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0693810Z [rank0]:E1204 13:11:48.547000 389400 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0693950Z [rank0]:E1204 13:11:48.547000 389400 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0694229Z [rank0]:E1204 13:11:48.547000 389400 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0694387Z [rank0]:E1204 13:11:48.547000 389400 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0694879Z [rank0]:E1204 13:11:48.547000 389400 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 108032 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 
2025-12-04T13:38:32.0694998Z [rank0]:E1204 13:11:48.547000 389400 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0695194Z [rank0]:E1204 13:11:48.547000 389400 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0695569Z [rank0]:E1204 13:11:48.547000 389400 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda 2025-12-04T13:38:32.0695683Z [rank0]:E1204 13:11:48.547000 389400 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0695899Z [rank0]:E1204 13:11:48.547000 389400 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0696075Z [rank0]:E1204 13:11:48.547000 389400 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.0696116Z dist init r=0, world=4 2025-12-04T13:38:32.0696254Z [rank1]:E1204 13:11:48.550000 389401 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0696415Z [rank1]:E1204 13:11:48.550000 389401 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0696702Z [rank1]:E1204 13:11:48.550000 389401 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0696855Z [rank1]:E1204 13:11:48.550000 389401 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0697142Z [rank1]:E1204 13:11:48.550000 389401 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0697277Z [rank1]:E1204 13:11:48.550000 389401 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0697557Z [rank1]:E1204 13:11:48.550000 389401 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0697705Z [rank1]:E1204 13:11:48.550000 389401 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0697985Z [rank1]:E1204 13:11:48.550000 389401 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0698135Z [rank1]:E1204 13:11:48.550000 389401 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0698411Z [rank1]:E1204 13:11:48.550000 389401 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0698559Z [rank1]:E1204 13:11:48.550000 389401 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.0698838Z [rank1]:E1204 13:11:48.550000 389401 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0698988Z [rank1]:E1204 13:11:48.550000 389401 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0699468Z [rank1]:E1204 13:11:48.550000 389401 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 2025-12-04T13:38:32.0699641Z [rank1]:E1204 13:11:48.550000 389401 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0699839Z [rank1]:E1204 13:11:48.550000 389401 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0700523Z [rank1]:E1204 13:11:48.550000 389401 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda 2025-12-04T13:38:32.0700655Z [rank1]:E1204 13:11:48.550000 389401 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0700869Z [rank1]:E1204 13:11:48.550000 389401 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0701035Z [rank1]:E1204 13:11:48.550000 389401 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.0701075Z dist init r=1, world=4 2025-12-04T13:38:32.0701215Z [rank3]:E1204 13:11:48.574000 389403 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0701375Z [rank3]:E1204 13:11:48.574000 389403 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0701661Z [rank3]:E1204 13:11:48.574000 389403 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0701837Z [rank3]:E1204 13:11:48.574000 389403 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0702121Z [rank3]:E1204 13:11:48.574000 389403 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0702247Z [rank3]:E1204 13:11:48.574000 389403 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0702522Z [rank3]:E1204 13:11:48.574000 389403 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0702670Z [rank3]:E1204 13:11:48.574000 389403 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:38:32.0702947Z [rank3]:E1204 13:11:48.574000 389403 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0703097Z [rank3]:E1204 13:11:48.574000 389403 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0703390Z [rank3]:E1204 13:11:48.574000 389403 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0703526Z [rank3]:E1204 13:11:48.574000 389403 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0703804Z [rank3]:E1204 13:11:48.574000 389403 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0703954Z [rank3]:E1204 13:11:48.574000 389403 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0704447Z [rank3]:E1204 13:11:48.574000 389403 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 108032 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T13:38:32.0704562Z [rank3]:E1204 13:11:48.574000 389403 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0704758Z [rank3]:E1204 13:11:48.574000 389403 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0705134Z [rank3]:E1204 13:11:48.574000 389403 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda 2025-12-04T13:38:32.0705249Z [rank3]:E1204 13:11:48.574000 389403 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0705467Z [rank3]:E1204 13:11:48.574000 389403 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0705631Z [rank3]:E1204 13:11:48.574000 389403 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.0705672Z dist init r=3, world=4 2025-12-04T13:38:32.0705810Z [rank2]:E1204 13:11:48.583000 389402 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0705987Z [rank2]:E1204 13:11:48.583000 389402 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0706305Z [rank2]:E1204 13:11:48.583000 389402 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0706459Z [rank2]:E1204 13:11:48.583000 389402 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0706745Z [rank2]:E1204 13:11:48.583000 389402 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0706867Z [rank2]:E1204 13:11:48.583000 389402 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0707150Z [rank2]:E1204 13:11:48.583000 389402 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0707298Z [rank2]:E1204 13:11:48.583000 389402 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0707592Z [rank2]:E1204 13:11:48.583000 389402 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0707742Z [rank2]:E1204 13:11:48.583000 389402 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0708017Z [rank2]:E1204 13:11:48.583000 389402 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0708155Z [rank2]:E1204 13:11:48.583000 389402 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0708444Z [rank2]:E1204 13:11:48.583000 389402 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0708596Z [rank2]:E1204 13:11:48.583000 389402 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0709075Z [rank2]:E1204 13:11:48.583000 389402 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 108032 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 
2025-12-04T13:38:32.0709200Z [rank2]:E1204 13:11:48.583000 389402 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0709396Z [rank2]:E1204 13:11:48.583000 389402 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0709793Z [rank2]:E1204 13:11:48.583000 389402 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda 2025-12-04T13:38:32.0709907Z [rank2]:E1204 13:11:48.583000 389402 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0710119Z [rank2]:E1204 13:11:48.583000 389402 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0710287Z [rank2]:E1204 13:11:48.583000 389402 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.0710339Z dist init r=2, world=4 2025-12-04T13:38:32.0710680Z [rank0]:[W1204 13:11:48.804148515 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.0710719Z FAILED [9.8222s] [ 4%] 2025-12-04T13:38:32.0710725Z 2025-12-04T13:38:32.0710781Z =================================== FAILURES =================================== 2025-12-04T13:38:32.0710885Z _ TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda _ 2025-12-04T13:38:32.0710931Z Traceback (most recent call last): 2025-12-04T13:38:32.0711098Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.0711143Z self._join_processes(fn) 2025-12-04T13:38:32.0711318Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.0711372Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.0711555Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.0711599Z raise RuntimeError(error) 2025-12-04T13:38:32.0711692Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.0711738Z Traceback (most recent call last): 2025-12-04T13:38:32.0711901Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0711942Z getattr(self, test_name)() 2025-12-04T13:38:32.0712104Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0712139Z fn() 2025-12-04T13:38:32.0712293Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0712335Z method(*args, **kwargs) 2025-12-04T13:38:32.0712504Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0712543Z method(*args, **kwargs) 2025-12-04T13:38:32.0712698Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0712738Z with policy(): 2025-12-04T13:38:32.0712893Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0712933Z raise RuntimeError(msg) 2025-12-04T13:38:32.0713293Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 108032 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T13:38:32.0713310Z 2025-12-04T13:38:32.0713386Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0713626Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda 2025-12-04T13:38:32.0713628Z 2025-12-04T13:38:32.0713718Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0713720Z 2025-12-04T13:38:32.0713722Z 2025-12-04T13:38:32.0713797Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.0713885Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.0714118Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-49f8d9fac1f68b0e.xml - 2025-12-04T13:38:32.0714192Z =========================== short test summary info ============================ 2025-12-04T13:38:32.0714448Z FAILED [9.8222s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.0714495Z Traceback (most recent call last): 2025-12-04T13:38:32.0714663Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0714706Z getattr(self, test_name)() 2025-12-04T13:38:32.0714868Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0714902Z fn() 2025-12-04T13:38:32.0715056Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0715098Z method(*args, **kwargs) 2025-12-04T13:38:32.0715252Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0715291Z method(*args, **kwargs) 2025-12-04T13:38:32.0715445Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0715491Z with policy(): 2025-12-04T13:38:32.0715647Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0715687Z raise RuntimeError(msg) 2025-12-04T13:38:32.0716047Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 108032 on device 0. 
CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T13:38:32.0716051Z 2025-12-04T13:38:32.0716125Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0716362Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda 2025-12-04T13:38:32.0716375Z 2025-12-04T13:38:32.0716462Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0716525Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.0716588Z ======================= 1 failed, 10 deselected in 9.96s ======================= 2025-12-04T13:38:32.0716625Z Got exit code 1 2025-12-04T13:38:32.0716668Z Retrying single test... 2025-12-04T13:38:32.0716860Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-14230de52c25d103.xml 2025-12-04T13:38:32.0716939Z ============================= test session starts ============================== 2025-12-04T13:38:32.0717053Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.0717096Z cachedir: .pytest_cache 2025-12-04T13:38:32.0717254Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.0717301Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.0717342Z configfile: pytest.ini 2025-12-04T13:38:32.0717505Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.0717579Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.0717810Z stepcurrent: skipping 10 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_no_shard_cuda 2025-12-04T13:38:32.0717854Z Running 1 items in this shard 2025-12-04T13:38:32.0717867Z 2025-12-04T13:38:32.0718180Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_no_shard_cuda I1204 13:11:53.447000 389733 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 389802 2025-12-04T13:38:32.0718337Z I1204 13:11:53.448000 389733 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 389803 2025-12-04T13:38:32.0718491Z I1204 13:11:53.448000 389733 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 389804 2025-12-04T13:38:32.0718643Z I1204 13:11:53.449000 389733 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 389805 2025-12-04T13:38:32.0718937Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0718991Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0719279Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:38:32.0719330Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0719956Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0719995Z _warn_cpu_init() 2025-12-04T13:38:32.0720284Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0720333Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0720915Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0720952Z _warn_cpu_init() 2025-12-04T13:38:32.0721240Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0721302Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0721880Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0721919Z _warn_cpu_init() 2025-12-04T13:38:32.0722493Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0722548Z _warn_cpu_init() 2025-12-04T13:38:32.0722836Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0722916Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0723203Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:38:32.0723280Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0723568Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0723641Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0723929Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0724011Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0724305Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.0724348Z return func(*args, **kwargs) 2025-12-04T13:38:32.0724578Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0724622Z return func(*args, **kwargs) 2025-12-04T13:38:32.0724846Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0724897Z return func(*args, **kwargs) 2025-12-04T13:38:32.0725121Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0725163Z return func(*args, **kwargs) 2025-12-04T13:38:32.0725382Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0725424Z return func(*args, **kwargs) 2025-12-04T13:38:32.0725642Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0725696Z return func(*args, **kwargs) 2025-12-04T13:38:32.0725915Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0725956Z return func(*args, **kwargs) 2025-12-04T13:38:32.0726176Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0726219Z return func(*args, **kwargs) 2025-12-04T13:38:32.0726439Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned.
2025-12-04T13:38:32.0726481Z return func(*args, **kwargs) 2025-12-04T13:38:32.0726628Z [rank0]:E1204 13:12:01.106000 389802 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0726804Z [rank0]:E1204 13:12:01.106000 389802 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0727096Z [rank0]:E1204 13:12:01.106000 389802 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0727250Z [rank0]:E1204 13:12:01.106000 389802 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0727538Z [rank0]:E1204 13:12:01.106000 389802 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0727662Z [rank0]:E1204 13:12:01.106000 389802 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0727941Z [rank0]:E1204 13:12:01.106000 389802 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0728100Z [rank0]:E1204 13:12:01.106000 389802 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0728377Z [rank0]:E1204 13:12:01.106000 389802 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0728525Z [rank0]:E1204 13:12:01.106000 389802 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0728805Z [rank0]:E1204 13:12:01.106000 389802 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0728945Z [rank0]:E1204 13:12:01.106000 389802 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0729233Z [rank0]:E1204 13:12:01.106000 389802 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0729383Z [rank0]:E1204 13:12:01.106000 389802 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0729900Z [rank0]:E1204 13:12:01.106000 389802 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 
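[Editor's note] Two of the recurring warnings in this log concern process-group lifecycle: "barrier(): using the device under current context" and the ProcessGroupNCCL note that destroy_process_group() was never called before program exit. A minimal sketch addressing both, assuming a torchrun-style launcher that sets LOCAL_RANK (this is illustrative, not the test suite's own setup code):

    import os
    import torch
    import torch.distributed as dist

    local_rank = int(os.environ["LOCAL_RANK"])  # provided by the launcher
    torch.cuda.set_device(local_rank)

    # Binding the group to a device up front silences the
    # "barrier(): using the device under current context" warning.
    dist.init_process_group("nccl", device_id=torch.device("cuda", local_rank))

    dist.barrier()
    # ... test or training body ...

    # Explicit teardown avoids the
    # "destroy_process_group() was not called before program exit" warning.
    dist.destroy_process_group()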
2025-12-04T13:38:32.0730032Z [rank0]:E1204 13:12:01.106000 389802 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0730228Z [rank0]:E1204 13:12:01.106000 389802 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0730592Z [rank0]:E1204 13:12:01.106000 389802 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda 2025-12-04T13:38:32.0730708Z [rank0]:E1204 13:12:01.106000 389802 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0730919Z [rank0]:E1204 13:12:01.106000 389802 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0731099Z [rank0]:E1204 13:12:01.106000 389802 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.0731138Z dist init r=0, world=4 2025-12-04T13:38:32.0731277Z [rank2]:E1204 13:12:01.107000 389804 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0731436Z [rank2]:E1204 13:12:01.107000 389804 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0731724Z [rank2]:E1204 13:12:01.107000 389804 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0731879Z [rank2]:E1204 13:12:01.107000 389804 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0732167Z [rank2]:E1204 13:12:01.107000 389804 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0732293Z [rank2]:E1204 13:12:01.107000 389804 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0732583Z [rank2]:E1204 13:12:01.107000 389804 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0732732Z [rank2]:E1204 13:12:01.107000 389804 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0733007Z [rank2]:E1204 13:12:01.107000 389804 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0733157Z [rank2]:E1204 13:12:01.107000 389804 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0733456Z [rank2]:E1204 13:12:01.107000 389804 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0733592Z [rank2]:E1204 13:12:01.107000 389804 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.0733869Z [rank2]:E1204 13:12:01.107000 389804 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0734016Z [rank2]:E1204 13:12:01.107000 389804 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0734516Z [rank2]:E1204 13:12:01.107000 389804 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 108032 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T13:38:32.0734631Z [rank2]:E1204 13:12:01.107000 389804 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0734826Z [rank2]:E1204 13:12:01.107000 389804 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0735189Z [rank2]:E1204 13:12:01.107000 389804 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda 2025-12-04T13:38:32.0735315Z [rank2]:E1204 13:12:01.107000 389804 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0735530Z [rank2]:E1204 13:12:01.107000 389804 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0735695Z [rank2]:E1204 13:12:01.107000 389804 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.0735735Z dist init r=2, world=4 2025-12-04T13:38:32.0735872Z [rank1]:E1204 13:12:01.153000 389803 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0736032Z [rank1]:E1204 13:12:01.153000 389803 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0736320Z [rank1]:E1204 13:12:01.153000 389803 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0736476Z [rank1]:E1204 13:12:01.153000 389803 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0736773Z [rank1]:E1204 13:12:01.153000 389803 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0736895Z [rank1]:E1204 13:12:01.153000 389803 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0737176Z [rank1]:E1204 13:12:01.153000 389803 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0737324Z [rank1]:E1204 13:12:01.153000 389803 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:38:32.0737615Z [rank1]:E1204 13:12:01.153000 389803 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0737762Z [rank1]:E1204 13:12:01.153000 389803 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0738041Z [rank1]:E1204 13:12:01.153000 389803 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0738180Z [rank1]:E1204 13:12:01.153000 389803 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0738466Z [rank1]:E1204 13:12:01.153000 389803 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0738615Z [rank1]:E1204 13:12:01.153000 389803 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0739096Z [rank1]:E1204 13:12:01.153000 389803 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 112128 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 2025-12-04T13:38:32.0739211Z [rank1]:E1204 13:12:01.153000 389803 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0739407Z [rank1]:E1204 13:12:01.153000 389803 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0739830Z [rank1]:E1204 13:12:01.153000 389803 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda 2025-12-04T13:38:32.0739945Z [rank1]:E1204 13:12:01.153000 389803 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0740155Z [rank1]:E1204 13:12:01.153000 389803 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0740321Z [rank1]:E1204 13:12:01.153000 389803 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.0740361Z dist init r=1, world=4 2025-12-04T13:38:32.0740505Z [rank3]:E1204 13:12:01.159000 389805 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0740665Z [rank3]:E1204 13:12:01.159000 389805 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0740969Z [rank3]:E1204 13:12:01.159000 389805 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0741122Z [rank3]:E1204 13:12:01.159000 389805 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0741410Z [rank3]:E1204 13:12:01.159000 389805 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0741535Z [rank3]:E1204 13:12:01.159000 389805 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0741814Z [rank3]:E1204 13:12:01.159000 389805 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0741976Z [rank3]:E1204 13:12:01.159000 389805 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0742251Z [rank3]:E1204 13:12:01.159000 389805 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0742400Z [rank3]:E1204 13:12:01.159000 389805 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0742674Z [rank3]:E1204 13:12:01.159000 389805 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0742826Z [rank3]:E1204 13:12:01.159000 389805 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0743107Z [rank3]:E1204 13:12:01.159000 389805 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0743255Z [rank3]:E1204 13:12:01.159000 389805 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0743736Z [rank3]:E1204 13:12:01.159000 389805 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 112128 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 
2025-12-04T13:38:32.0743863Z [rank3]:E1204 13:12:01.159000 389805 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0744061Z [rank3]:E1204 13:12:01.159000 389805 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0744423Z [rank3]:E1204 13:12:01.159000 389805 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda 2025-12-04T13:38:32.0744534Z [rank3]:E1204 13:12:01.159000 389805 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0744746Z [rank3]:E1204 13:12:01.159000 389805 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0744913Z [rank3]:E1204 13:12:01.159000 389805 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.0744952Z dist init r=3, world=4 2025-12-04T13:38:32.0745301Z [rank0]:[W1204 13:12:01.309315532 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.0745343Z FAILED [9.6202s] [100%] 2025-12-04T13:38:32.0745345Z 2025-12-04T13:38:32.0745401Z =================================== FAILURES =================================== 2025-12-04T13:38:32.0745502Z _ TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda _ 2025-12-04T13:38:32.0745548Z Traceback (most recent call last): 2025-12-04T13:38:32.0745713Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.0745757Z self._join_processes(fn) 2025-12-04T13:38:32.0745934Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.0746000Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.0746180Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.0746225Z raise RuntimeError(error) 2025-12-04T13:38:32.0746305Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.0746352Z Traceback (most recent call last): 2025-12-04T13:38:32.0746513Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0746557Z getattr(self, test_name)() 2025-12-04T13:38:32.0746727Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0746764Z fn() 2025-12-04T13:38:32.0746916Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0746959Z method(*args, **kwargs) 2025-12-04T13:38:32.0747112Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0747154Z method(*args, **kwargs) 2025-12-04T13:38:32.0747304Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0747344Z with policy(): 2025-12-04T13:38:32.0747500Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0747542Z raise RuntimeError(msg) 2025-12-04T13:38:32.0747902Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T13:38:32.0747914Z 2025-12-04T13:38:32.0747992Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0748232Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda 2025-12-04T13:38:32.0748234Z 2025-12-04T13:38:32.0748321Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0748323Z 2025-12-04T13:38:32.0748384Z Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.0748428Z Traceback (most recent call last): 2025-12-04T13:38:32.0748593Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0748635Z getattr(self, test_name)() 2025-12-04T13:38:32.0748796Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0748830Z fn() 2025-12-04T13:38:32.0748984Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0749040Z method(*args, **kwargs) 2025-12-04T13:38:32.0749193Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0749233Z method(*args, **kwargs) 2025-12-04T13:38:32.0749384Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0749420Z with policy(): 2025-12-04T13:38:32.0749615Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0749658Z raise RuntimeError(msg) 2025-12-04T13:38:32.0750033Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 108032 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 
2025-12-04T13:38:32.0750035Z 2025-12-04T13:38:32.0750113Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0750348Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda 2025-12-04T13:38:32.0750350Z 2025-12-04T13:38:32.0750437Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0750439Z 2025-12-04T13:38:32.0750454Z 2025-12-04T13:38:32.0750529Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.0750619Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.0750853Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-14230de52c25d103.xml - 2025-12-04T13:38:32.0750915Z =========================== short test summary info ============================ 2025-12-04T13:38:32.0751169Z FAILED [9.6202s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.0751217Z Traceback (most recent call last): 2025-12-04T13:38:32.0751383Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0751424Z getattr(self, test_name)() 2025-12-04T13:38:32.0751590Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0751637Z fn() 2025-12-04T13:38:32.0751792Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0751831Z method(*args, **kwargs) 2025-12-04T13:38:32.0751984Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0752023Z method(*args, **kwargs) 2025-12-04T13:38:32.0752176Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0752211Z with policy(): 2025-12-04T13:38:32.0752366Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0752405Z raise RuntimeError(msg) 2025-12-04T13:38:32.0752763Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 
2025-12-04T13:38:32.0752767Z 2025-12-04T13:38:32.0752841Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0753091Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda 2025-12-04T13:38:32.0753093Z 2025-12-04T13:38:32.0753182Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0753184Z 2025-12-04T13:38:32.0753242Z Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.0753290Z Traceback (most recent call last): 2025-12-04T13:38:32.0753453Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0753498Z getattr(self, test_name)() 2025-12-04T13:38:32.0753658Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0753695Z fn() 2025-12-04T13:38:32.0753857Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0753898Z method(*args, **kwargs) 2025-12-04T13:38:32.0754048Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0754089Z method(*args, **kwargs) 2025-12-04T13:38:32.0754350Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0754388Z with policy(): 2025-12-04T13:38:32.0754553Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0754596Z raise RuntimeError(msg) 2025-12-04T13:38:32.0754958Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 108032 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T13:38:32.0754961Z 2025-12-04T13:38:32.0755035Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0755274Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda 2025-12-04T13:38:32.0755276Z 2025-12-04T13:38:32.0755362Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0755426Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.0755500Z ======================= 1 failed, 32 deselected in 9.78s ======================= 2025-12-04T13:38:32.0755539Z Got exit code 1 2025-12-04T13:38:32.0755579Z Retrying single test... 
2025-12-04T13:38:32.0755773Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-604e6d382b28e77b.xml 2025-12-04T13:38:32.0755831Z ============================= test session starts ============================== 2025-12-04T13:38:32.0755947Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.0755988Z cachedir: .pytest_cache 2025-12-04T13:38:32.0756148Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.0756195Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.0756236Z configfile: pytest.ini 2025-12-04T13:38:32.0756402Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.0756480Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.0756709Z stepcurrent: skipping 10 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_no_shard_cuda 2025-12-04T13:38:32.0756755Z Running 1 items in this shard 2025-12-04T13:38:32.0756767Z 2025-12-04T13:38:32.0757078Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_no_shard_cuda I1204 13:12:05.485000 390135 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 390204 2025-12-04T13:38:32.0757236Z I1204 13:12:05.486000 390135 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 390205 2025-12-04T13:38:32.0757391Z I1204 13:12:05.487000 390135 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 390206 2025-12-04T13:38:32.0757543Z I1204 13:12:05.487000 390135 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 390207 2025-12-04T13:38:32.0757849Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0757901Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0758483Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0758532Z _warn_cpu_init() 2025-12-04T13:38:32.0758823Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0758874Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0759446Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0759484Z _warn_cpu_init() 2025-12-04T13:38:32.0759816Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0759909Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0760197Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0760274Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0760561Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0760609Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0761189Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0761240Z _warn_cpu_init() 2025-12-04T13:38:32.0761533Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.0761575Z return func(*args, **kwargs) 2025-12-04T13:38:32.0761863Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0761938Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0762240Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0762290Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.0762863Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.0762918Z _warn_cpu_init() 2025-12-04T13:38:32.0763205Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0763282Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0763513Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0763557Z return func(*args, **kwargs) 2025-12-04T13:38:32.0763780Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0763821Z return func(*args, **kwargs) 2025-12-04T13:38:32.0764044Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0764102Z return func(*args, **kwargs) 2025-12-04T13:38:32.0764325Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0764366Z return func(*args, **kwargs) 2025-12-04T13:38:32.0764589Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0764629Z return func(*args, **kwargs) 2025-12-04T13:38:32.0764849Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0764889Z return func(*args, **kwargs) 2025-12-04T13:38:32.0765110Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.0765152Z return func(*args, **kwargs) 2025-12-04T13:38:32.0765377Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T13:38:32.0765417Z return func(*args, **kwargs) 2025-12-04T13:38:32.0765592Z [rank0]:E1204 13:12:13.248000 390204 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0765758Z [rank0]:E1204 13:12:13.248000 390204 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0766050Z [rank0]:E1204 13:12:13.248000 390204 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0766211Z [rank0]:E1204 13:12:13.248000 390204 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0766506Z [rank0]:E1204 13:12:13.248000 390204 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0766634Z [rank0]:E1204 13:12:13.248000 390204 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0766911Z [rank0]:E1204 13:12:13.248000 390204 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0767061Z [rank0]:E1204 13:12:13.248000 390204 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0767350Z [rank0]:E1204 13:12:13.248000 390204 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0767504Z [rank0]:E1204 13:12:13.248000 390204 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0767788Z [rank0]:E1204 13:12:13.248000 390204 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0767926Z [rank0]:E1204 13:12:13.248000 390204 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0768206Z [rank0]:E1204 13:12:13.248000 390204 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0768365Z [rank0]:E1204 13:12:13.248000 390204 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0768855Z [rank0]:E1204 13:12:13.248000 390204 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 112128 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 
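Two of the warnings interleaved above are actionable at the call site: the UserWarning from _init_utils.py asks for a device_id when wrapping a CPU module with FSDP, and the FutureWarning says the NO_SHARD strategy is deprecated in favour of DistributedDataParallel. The sketch below shows both suggestions under assumed names (model and rank are placeholders, and it presumes a process group is already initialized); it is illustrative, not the test's actual setup code.

import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.nn.parallel import DistributedDataParallel as DDP

# Assumes dist.init_process_group(...) has already been called by the launcher.
rank = dist.get_rank()
device = torch.device("cuda", rank % torch.cuda.device_count())

model = torch.nn.Linear(8, 8)  # placeholder module, still on CPU

# Passing device_id lets FSDP run sharding initialization on the GPU instead of
# the CPU and is required for sync_module_states=True, silencing the warning above.
fsdp_model = FSDP(model, device_id=device, sync_module_states=True)

# The FutureWarning's suggested replacement for the deprecated NO_SHARD strategy
# is plain DDP:
ddp_model = DDP(torch.nn.Linear(8, 8).to(device), device_ids=[device.index])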
2025-12-04T13:38:32.0768973Z [rank0]:E1204 13:12:13.248000 390204 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0769168Z [rank0]:E1204 13:12:13.248000 390204 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0769533Z [rank0]:E1204 13:12:13.248000 390204 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda 2025-12-04T13:38:32.0769688Z [rank0]:E1204 13:12:13.248000 390204 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0769921Z [rank0]:E1204 13:12:13.248000 390204 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0770087Z [rank0]:E1204 13:12:13.248000 390204 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.0770128Z dist init r=0, world=4 2025-12-04T13:38:32.0770266Z [rank2]:E1204 13:12:13.251000 390206 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0770428Z [rank2]:E1204 13:12:13.251000 390206 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0770722Z [rank2]:E1204 13:12:13.251000 390206 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0770888Z [rank2]:E1204 13:12:13.251000 390206 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0771174Z [rank2]:E1204 13:12:13.251000 390206 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0771297Z [rank2]:E1204 13:12:13.251000 390206 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0771574Z [rank2]:E1204 13:12:13.251000 390206 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0771740Z [rank2]:E1204 13:12:13.251000 390206 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0772019Z [rank2]:E1204 13:12:13.251000 390206 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0772167Z [rank2]:E1204 13:12:13.251000 390206 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0772444Z [rank2]:E1204 13:12:13.251000 390206 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0772584Z [rank2]:E1204 13:12:13.251000 390206 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.0772874Z [rank2]:E1204 13:12:13.251000 390206 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0773025Z [rank2]:E1204 13:12:13.251000 390206 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0773508Z [rank2]:E1204 13:12:13.251000 390206 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 99840 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T13:38:32.0773623Z [rank2]:E1204 13:12:13.251000 390206 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0773822Z [rank2]:E1204 13:12:13.251000 390206 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0774196Z [rank2]:E1204 13:12:13.251000 390206 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda 2025-12-04T13:38:32.0774312Z [rank2]:E1204 13:12:13.251000 390206 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0774525Z [rank2]:E1204 13:12:13.251000 390206 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0774692Z [rank2]:E1204 13:12:13.251000 390206 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.0774732Z dist init r=2, world=4 2025-12-04T13:38:32.0774871Z [rank3]:E1204 13:12:13.319000 390207 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0775039Z [rank3]:E1204 13:12:13.319000 390207 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0775330Z [rank3]:E1204 13:12:13.319000 390207 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0775484Z [rank3]:E1204 13:12:13.319000 390207 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0775766Z [rank3]:E1204 13:12:13.319000 390207 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0775903Z [rank3]:E1204 13:12:13.319000 390207 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0776180Z [rank3]:E1204 13:12:13.319000 390207 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0776329Z [rank3]:E1204 13:12:13.319000 390207 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:38:32.0776606Z [rank3]:E1204 13:12:13.319000 390207 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0776754Z [rank3]:E1204 13:12:13.319000 390207 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0777045Z [rank3]:E1204 13:12:13.319000 390207 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0777182Z [rank3]:E1204 13:12:13.319000 390207 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0777461Z [rank3]:E1204 13:12:13.319000 390207 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0777608Z [rank3]:E1204 13:12:13.319000 390207 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0778091Z [rank3]:E1204 13:12:13.319000 390207 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 99840 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T13:38:32.0778206Z [rank3]:E1204 13:12:13.319000 390207 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0778413Z [rank3]:E1204 13:12:13.319000 390207 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0778779Z [rank3]:E1204 13:12:13.319000 390207 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda 2025-12-04T13:38:32.0778891Z [rank3]:E1204 13:12:13.319000 390207 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0779107Z [rank3]:E1204 13:12:13.319000 390207 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0779271Z [rank3]:E1204 13:12:13.319000 390207 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.0779323Z dist init r=3, world=4 2025-12-04T13:38:32.0779461Z [rank1]:E1204 13:12:13.319000 390205 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0779657Z [rank1]:E1204 13:12:13.319000 390205 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0779944Z [rank1]:E1204 13:12:13.319000 390205 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0780118Z [rank1]:E1204 13:12:13.319000 390205 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0780406Z [rank1]:E1204 13:12:13.319000 390205 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0780529Z [rank1]:E1204 13:12:13.319000 390205 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0780807Z [rank1]:E1204 13:12:13.319000 390205 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0780955Z [rank1]:E1204 13:12:13.319000 390205 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0781235Z [rank1]:E1204 13:12:13.319000 390205 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0781396Z [rank1]:E1204 13:12:13.319000 390205 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0781677Z [rank1]:E1204 13:12:13.319000 390205 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0781815Z [rank1]:E1204 13:12:13.319000 390205 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0782092Z [rank1]:E1204 13:12:13.319000 390205 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0782242Z [rank1]:E1204 13:12:13.319000 390205 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0782740Z [rank1]:E1204 13:12:13.319000 390205 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 103936 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 
2025-12-04T13:38:32.0782856Z [rank1]:E1204 13:12:13.319000 390205 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0783050Z [rank1]:E1204 13:12:13.319000 390205 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0783416Z [rank1]:E1204 13:12:13.319000 390205 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda 2025-12-04T13:38:32.0783532Z [rank1]:E1204 13:12:13.319000 390205 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0783756Z [rank1]:E1204 13:12:13.319000 390205 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0783924Z [rank1]:E1204 13:12:13.319000 390205 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.0783961Z dist init r=1, world=4 2025-12-04T13:38:32.0784302Z [rank0]:[W1204 13:12:13.497086933 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.0784353Z FAILED [9.9195s] [100%] 2025-12-04T13:38:32.0784355Z 2025-12-04T13:38:32.0784413Z =================================== FAILURES =================================== 2025-12-04T13:38:32.0784513Z _ TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda _ 2025-12-04T13:38:32.0784562Z Traceback (most recent call last): 2025-12-04T13:38:32.0784727Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.0784772Z self._join_processes(fn) 2025-12-04T13:38:32.0784946Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.0785000Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.0785181Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.0785225Z raise RuntimeError(error) 2025-12-04T13:38:32.0785319Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.0785364Z Traceback (most recent call last): 2025-12-04T13:38:32.0785528Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0785571Z getattr(self, test_name)() 2025-12-04T13:38:32.0785732Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0785767Z fn() 2025-12-04T13:38:32.0785920Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0785961Z method(*args, **kwargs) 2025-12-04T13:38:32.0786114Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0786155Z method(*args, **kwargs) 2025-12-04T13:38:32.0786310Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0786346Z with policy(): 2025-12-04T13:38:32.0786503Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0786544Z raise RuntimeError(msg) 2025-12-04T13:38:32.0786917Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 112128 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T13:38:32.0786920Z 2025-12-04T13:38:32.0786997Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0787234Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda 2025-12-04T13:38:32.0787238Z 2025-12-04T13:38:32.0787328Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0787330Z 2025-12-04T13:38:32.0787332Z 2025-12-04T13:38:32.0787418Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.0787508Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.0787743Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-604e6d382b28e77b.xml - 2025-12-04T13:38:32.0787805Z =========================== short test summary info ============================ 2025-12-04T13:38:32.0788056Z FAILED [9.9195s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.0788117Z Traceback (most recent call last): 2025-12-04T13:38:32.0788282Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0788326Z getattr(self, test_name)() 2025-12-04T13:38:32.0788488Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0788522Z fn() 2025-12-04T13:38:32.0788677Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0788717Z method(*args, **kwargs) 2025-12-04T13:38:32.0788871Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0788910Z method(*args, **kwargs) 2025-12-04T13:38:32.0789063Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0789112Z with policy(): 2025-12-04T13:38:32.0789266Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0789306Z raise RuntimeError(msg) 2025-12-04T13:38:32.0789696Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 112128 on device 0. 
CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T13:38:32.0789698Z 2025-12-04T13:38:32.0789773Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0790010Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_no_shard_cuda 2025-12-04T13:38:32.0790013Z 2025-12-04T13:38:32.0790100Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0790166Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.0790229Z ====================== 1 failed, 32 deselected in 10.08s ======================= 2025-12-04T13:38:32.0790266Z Got exit code 1 2025-12-04T13:38:32.0790453Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_no_shard_cuda 2025-12-04T13:38:32.0790609Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:32.0790803Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6a82e7f2d48533a4.xml 2025-12-04T13:38:32.0790861Z ============================= test session starts ============================== 2025-12-04T13:38:32.0790978Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.0791021Z cachedir: .pytest_cache 2025-12-04T13:38:32.0791182Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.0791227Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.0791269Z configfile: pytest.ini 2025-12-04T13:38:32.0791449Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.0791526Z collecting ... collected 60 items / 11 deselected / 49 selected 2025-12-04T13:38:32.0791578Z stepcurrent: skipping 11 already run items. 2025-12-04T13:38:32.0791623Z Running 22 items in this shard 2025-12-04T13:38:32.0791625Z 2025-12-04T13:38:32.0791933Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_none_cuda I1204 13:12:18.128000 390537 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 390606 2025-12-04T13:38:32.0792108Z I1204 13:12:18.129000 390537 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 390607 2025-12-04T13:38:32.0792263Z I1204 13:12:18.130000 390537 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 390608 2025-12-04T13:38:32.0792414Z I1204 13:12:18.131000 390537 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 390609 2025-12-04T13:38:32.0793103Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.0793142Z _warn_cpu_init() 2025-12-04T13:38:32.0793736Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0793773Z _warn_cpu_init() 2025-12-04T13:38:32.0794345Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0794385Z _warn_cpu_init() 2025-12-04T13:38:32.0794966Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0795006Z _warn_cpu_init() 2025-12-04T13:38:32.0795298Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T13:38:32.0795343Z return func(*args, **kwargs) 2025-12-04T13:38:32.0795491Z [rank0]:E1204 13:13:13.999000 390606 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0795654Z [rank0]:E1204 13:13:13.999000 390606 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0795966Z [rank0]:E1204 13:13:13.999000 390606 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0796121Z [rank0]:E1204 13:13:13.999000 390606 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0796406Z [rank0]:E1204 13:13:13.999000 390606 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0796530Z [rank0]:E1204 13:13:13.999000 390606 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0796823Z [rank0]:E1204 13:13:13.999000 390606 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0796971Z [rank0]:E1204 13:13:13.999000 390606 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0797252Z [rank0]:E1204 13:13:13.999000 390606 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0797400Z [rank0]:E1204 13:13:13.999000 390606 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0797675Z [rank0]:E1204 13:13:13.999000 390606 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0797826Z [rank0]:E1204 13:13:13.999000 390606 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0798104Z [rank0]:E1204 13:13:13.999000 390606 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0798254Z [rank0]:E1204 13:13:13.999000 390606 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0798732Z [rank0]:E1204 13:13:13.999000 390606 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 
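The two process-group warnings logged earlier (barrier() "using the device under current context", and destroy_process_group() not being called before program exit) both point at explicit lifecycle management. A hedged sketch follows; the rank, world_size, and MASTER_* values are placeholders, not what the CI launcher uses.

import os
import torch
import torch.distributed as dist

os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")

rank, world_size = 0, 1  # placeholders; a real launcher supplies these
device = torch.device("cuda", rank)

# Binding the group to a device via device_id avoids the barrier() warning
# about picking "the device under current context".
dist.init_process_group("nccl", rank=rank, world_size=world_size, device_id=device)

dist.barrier()

# Explicit teardown avoids the ProcessGroupNCCL warning about
# destroy_process_group() not being called before program exit.
dist.destroy_process_group()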
2025-12-04T13:38:32.0798850Z [rank0]:E1204 13:13:13.999000 390606 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0799048Z [rank0]:E1204 13:13:13.999000 390606 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0799419Z [rank0]:E1204 13:13:13.999000 390606 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda 2025-12-04T13:38:32.0799535Z [rank0]:E1204 13:13:13.999000 390606 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0799784Z [rank0]:E1204 13:13:13.999000 390606 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0799951Z [rank0]:E1204 13:13:13.999000 390606 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.0799989Z dist init r=0, world=4 2025-12-04T13:38:32.0800145Z [rank2]:E1204 13:13:13.999000 390608 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0800308Z [rank2]:E1204 13:13:13.999000 390608 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0800597Z [rank2]:E1204 13:13:13.999000 390608 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0800752Z [rank2]:E1204 13:13:13.999000 390608 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0801053Z [rank2]:E1204 13:13:13.999000 390608 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0801181Z [rank2]:E1204 13:13:13.999000 390608 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0801458Z [rank2]:E1204 13:13:13.999000 390608 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0801608Z [rank2]:E1204 13:13:13.999000 390608 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0801883Z [rank2]:E1204 13:13:13.999000 390608 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0802059Z [rank2]:E1204 13:13:13.999000 390608 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0802336Z [rank2]:E1204 13:13:13.999000 390608 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0802472Z [rank2]:E1204 13:13:13.999000 390608 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.0802750Z [rank2]:E1204 13:13:13.999000 390608 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0802898Z [rank2]:E1204 13:13:13.999000 390608 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0803385Z [rank2]:E1204 13:13:13.999000 390608 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 2025-12-04T13:38:32.0803518Z [rank2]:E1204 13:13:13.999000 390608 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0803712Z [rank2]:E1204 13:13:13.999000 390608 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0804071Z [rank2]:E1204 13:13:13.999000 390608 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda 2025-12-04T13:38:32.0804185Z [rank2]:E1204 13:13:13.999000 390608 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0804398Z [rank2]:E1204 13:13:13.999000 390608 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0804572Z [rank2]:E1204 13:13:13.999000 390608 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.0804613Z dist init r=2, world=4 2025-12-04T13:38:32.0804750Z [rank1]:E1204 13:13:14.004000 390607 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0804913Z [rank1]:E1204 13:13:14.004000 390607 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0805204Z [rank1]:E1204 13:13:14.004000 390607 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0805368Z [rank1]:E1204 13:13:14.004000 390607 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0805655Z [rank1]:E1204 13:13:14.004000 390607 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0805777Z [rank1]:E1204 13:13:14.004000 390607 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0806055Z [rank1]:E1204 13:13:14.004000 390607 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0806201Z [rank1]:E1204 13:13:14.004000 390607 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:38:32.0806489Z [rank1]:E1204 13:13:14.004000 390607 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0806637Z [rank1]:E1204 13:13:14.004000 390607 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0806912Z [rank1]:E1204 13:13:14.004000 390607 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0807048Z [rank1]:E1204 13:13:14.004000 390607 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0807325Z [rank1]:E1204 13:13:14.004000 390607 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0807476Z [rank1]:E1204 13:13:14.004000 390607 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0807963Z [rank1]:E1204 13:13:14.004000 390607 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 2025-12-04T13:38:32.0808079Z [rank1]:E1204 13:13:14.004000 390607 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0808275Z [rank1]:E1204 13:13:14.004000 390607 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0808631Z [rank1]:E1204 13:13:14.004000 390607 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda 2025-12-04T13:38:32.0808755Z [rank1]:E1204 13:13:14.004000 390607 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0808966Z [rank1]:E1204 13:13:14.004000 390607 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0809132Z [rank1]:E1204 13:13:14.004000 390607 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.0809169Z dist init r=1, world=4 2025-12-04T13:38:32.0809311Z [rank3]:E1204 13:13:14.055000 390609 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0809483Z [rank3]:E1204 13:13:14.055000 390609 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0809806Z [rank3]:E1204 13:13:14.055000 390609 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0809967Z [rank3]:E1204 13:13:14.055000 390609 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:38:32.0810253Z [rank3]:E1204 13:13:14.055000 390609 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0810380Z [rank3]:E1204 13:13:14.055000 390609 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0810663Z [rank3]:E1204 13:13:14.055000 390609 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0810828Z [rank3]:E1204 13:13:14.055000 390609 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0811106Z [rank3]:E1204 13:13:14.055000 390609 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0811255Z [rank3]:E1204 13:13:14.055000 390609 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0811540Z [rank3]:E1204 13:13:14.055000 390609 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0811678Z [rank3]:E1204 13:13:14.055000 390609 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0811959Z [rank3]:E1204 13:13:14.055000 390609 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0812125Z [rank3]:E1204 13:13:14.055000 390609 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0812605Z [rank3]:E1204 13:13:14.055000 390609 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 
2025-12-04T13:38:32.0812721Z [rank3]:E1204 13:13:14.055000 390609 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0812922Z [rank3]:E1204 13:13:14.055000 390609 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0813298Z [rank3]:E1204 13:13:14.055000 390609 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda 2025-12-04T13:38:32.0813411Z [rank3]:E1204 13:13:14.055000 390609 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0813625Z [rank3]:E1204 13:13:14.055000 390609 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0813802Z [rank3]:E1204 13:13:14.055000 390609 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.0813846Z dist init r=3, world=4 2025-12-04T13:38:32.0814184Z [rank0]:[W1204 13:13:14.170152696 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.0814230Z FAILED [57.7880s] [ 4%] 2025-12-04T13:38:32.0814233Z 2025-12-04T13:38:32.0814290Z =================================== FAILURES =================================== 2025-12-04T13:38:32.0814392Z ___ TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda ___ 2025-12-04T13:38:32.0814439Z Traceback (most recent call last): 2025-12-04T13:38:32.0814605Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.0814653Z self._join_processes(fn) 2025-12-04T13:38:32.0814837Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.0814896Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.0815076Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.0815124Z raise RuntimeError(error) 2025-12-04T13:38:32.0815204Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.0815253Z Traceback (most recent call last): 2025-12-04T13:38:32.0815417Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0815464Z getattr(self, test_name)() 2025-12-04T13:38:32.0815624Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0815664Z fn() 2025-12-04T13:38:32.0815817Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0815861Z method(*args, **kwargs) 2025-12-04T13:38:32.0816014Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0816058Z method(*args, **kwargs) 2025-12-04T13:38:32.0816219Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0816264Z with policy(): 2025-12-04T13:38:32.0816418Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0816463Z raise RuntimeError(msg) 2025-12-04T13:38:32.0816816Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 2025-12-04T13:38:32.0816823Z 2025-12-04T13:38:32.0816898Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0817146Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda 2025-12-04T13:38:32.0817150Z 2025-12-04T13:38:32.0817239Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0817241Z 2025-12-04T13:38:32.0817243Z 2025-12-04T13:38:32.0817320Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.0817409Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.0817646Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6a82e7f2d48533a4.xml - 2025-12-04T13:38:32.0817720Z =========================== short test summary info ============================ 2025-12-04T13:38:32.0817972Z FAILED [57.7880s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.0818019Z Traceback (most recent call last): 2025-12-04T13:38:32.0818187Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0818233Z getattr(self, test_name)() 2025-12-04T13:38:32.0818395Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0818433Z fn() 2025-12-04T13:38:32.0818585Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0818641Z method(*args, **kwargs) 2025-12-04T13:38:32.0818794Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0818836Z method(*args, **kwargs) 2025-12-04T13:38:32.0818987Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0819029Z with policy(): 2025-12-04T13:38:32.0819182Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0819226Z raise RuntimeError(msg) 2025-12-04T13:38:32.0819615Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 
2025-12-04T13:38:32.0819619Z 2025-12-04T13:38:32.0819699Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0819932Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda 2025-12-04T13:38:32.0819935Z 2025-12-04T13:38:32.0820027Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0820113Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.0820179Z ====================== 1 failed, 11 deselected in 57.93s ======================= 2025-12-04T13:38:32.0820222Z Got exit code 1 2025-12-04T13:38:32.0820263Z Retrying single test... 2025-12-04T13:38:32.0820456Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-f3d78b10e04e870d.xml 2025-12-04T13:38:32.0820515Z ============================= test session starts ============================== 2025-12-04T13:38:32.0820633Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.0820674Z cachedir: .pytest_cache 2025-12-04T13:38:32.0820850Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.0820897Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.0820943Z configfile: pytest.ini 2025-12-04T13:38:32.0821106Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.0821183Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.0821408Z stepcurrent: skipping 11 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_none_cuda 2025-12-04T13:38:32.0821469Z Running 1 items in this shard 2025-12-04T13:38:32.0821472Z 2025-12-04T13:38:32.0821778Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_none_cuda I1204 13:13:18.584000 390939 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 391008 2025-12-04T13:38:32.0821937Z I1204 13:13:18.585000 390939 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 391009 2025-12-04T13:38:32.0822089Z I1204 13:13:18.585000 390939 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 391010 2025-12-04T13:38:32.0822245Z I1204 13:13:18.586000 390939 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 391011 2025-12-04T13:38:32.0822835Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.0822890Z _warn_cpu_init() 2025-12-04T13:38:32.0823463Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0823502Z _warn_cpu_init() 2025-12-04T13:38:32.0823800Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.0823848Z return func(*args, **kwargs) 2025-12-04T13:38:32.0824437Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0824481Z _warn_cpu_init() 2025-12-04T13:38:32.0825050Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.0825093Z _warn_cpu_init() 2025-12-04T13:38:32.0825246Z [rank1]:E1204 13:14:14.648000 391009 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0825414Z [rank1]:E1204 13:14:14.648000 391009 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0825708Z [rank1]:E1204 13:14:14.648000 391009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0825864Z [rank1]:E1204 13:14:14.648000 391009 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0826162Z [rank1]:E1204 13:14:14.648000 391009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0826289Z [rank1]:E1204 13:14:14.648000 391009 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0826569Z [rank1]:E1204 13:14:14.648000 391009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0826716Z [rank1]:E1204 13:14:14.648000 391009 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0826997Z [rank1]:E1204 13:14:14.648000 391009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0827165Z [rank1]:E1204 13:14:14.648000 391009 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0827444Z [rank1]:E1204 13:14:14.648000 391009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0827585Z [rank1]:E1204 13:14:14.648000 391009 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0827862Z [rank1]:E1204 13:14:14.648000 391009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0828013Z [rank1]:E1204 13:14:14.648000 391009 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0828498Z [rank1]:E1204 13:14:14.648000 391009 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 
2025-12-04T13:38:32.0828629Z [rank1]:E1204 13:14:14.648000 391009 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0828828Z [rank1]:E1204 13:14:14.648000 391009 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0829185Z [rank1]:E1204 13:14:14.648000 391009 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda 2025-12-04T13:38:32.0829304Z [rank1]:E1204 13:14:14.648000 391009 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0829516Z [rank1]:E1204 13:14:14.648000 391009 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0829738Z [rank1]:E1204 13:14:14.648000 391009 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.0829778Z dist init r=1, world=4 2025-12-04T13:38:32.0829919Z [rank3]:E1204 13:14:14.652000 391011 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0830078Z [rank3]:E1204 13:14:14.652000 391011 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0830370Z [rank3]:E1204 13:14:14.652000 391011 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0830542Z [rank3]:E1204 13:14:14.652000 391011 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0830829Z [rank3]:E1204 13:14:14.652000 391011 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0830956Z [rank3]:E1204 13:14:14.652000 391011 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0831234Z [rank3]:E1204 13:14:14.652000 391011 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0831387Z [rank3]:E1204 13:14:14.652000 391011 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0831677Z [rank3]:E1204 13:14:14.652000 391011 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0831828Z [rank3]:E1204 13:14:14.652000 391011 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0832114Z [rank3]:E1204 13:14:14.652000 391011 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0832249Z [rank3]:E1204 13:14:14.652000 391011 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.0832529Z [rank3]:E1204 13:14:14.652000 391011 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0832680Z [rank3]:E1204 13:14:14.652000 391011 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0833173Z [rank3]:E1204 13:14:14.652000 391011 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 2025-12-04T13:38:32.0833292Z [rank3]:E1204 13:14:14.652000 391011 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0833489Z [rank3]:E1204 13:14:14.652000 391011 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0833851Z [rank3]:E1204 13:14:14.652000 391011 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda 2025-12-04T13:38:32.0833976Z [rank3]:E1204 13:14:14.652000 391011 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0834192Z [rank3]:E1204 13:14:14.652000 391011 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0834358Z [rank3]:E1204 13:14:14.652000 391011 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.0834401Z dist init r=3, world=4 2025-12-04T13:38:32.0834539Z [rank0]:E1204 13:14:14.655000 391008 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0834714Z [rank0]:E1204 13:14:14.655000 391008 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0835005Z [rank0]:E1204 13:14:14.655000 391008 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0835160Z [rank0]:E1204 13:14:14.655000 391008 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0835448Z [rank0]:E1204 13:14:14.655000 391008 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0835571Z [rank0]:E1204 13:14:14.655000 391008 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0835852Z [rank0]:E1204 13:14:14.655000 391008 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0836011Z [rank0]:E1204 13:14:14.655000 391008 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:38:32.0836295Z [rank0]:E1204 13:14:14.655000 391008 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0836441Z [rank0]:E1204 13:14:14.655000 391008 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0836723Z [rank0]:E1204 13:14:14.655000 391008 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0836863Z [rank0]:E1204 13:14:14.655000 391008 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0837141Z [rank0]:E1204 13:14:14.655000 391008 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0837302Z [rank0]:E1204 13:14:14.655000 391008 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0837777Z [rank0]:E1204 13:14:14.655000 391008 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 2025-12-04T13:38:32.0837894Z [rank0]:E1204 13:14:14.655000 391008 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0838094Z [rank0]:E1204 13:14:14.655000 391008 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0838462Z [rank0]:E1204 13:14:14.655000 391008 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda 2025-12-04T13:38:32.0838579Z [rank0]:E1204 13:14:14.655000 391008 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0838792Z [rank0]:E1204 13:14:14.655000 391008 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0838969Z [rank0]:E1204 13:14:14.655000 391008 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.0839009Z dist init r=0, world=4 2025-12-04T13:38:32.0839150Z [rank2]:E1204 13:14:14.656000 391010 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0839312Z [rank2]:E1204 13:14:14.656000 391010 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0839643Z [rank2]:E1204 13:14:14.656000 391010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0839802Z [rank2]:E1204 13:14:14.656000 391010 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:38:32.0840088Z [rank2]:E1204 13:14:14.656000 391010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0840228Z [rank2]:E1204 13:14:14.656000 391010 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0840507Z [rank2]:E1204 13:14:14.656000 391010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0840656Z [rank2]:E1204 13:14:14.656000 391010 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0840933Z [rank2]:E1204 13:14:14.656000 391010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0841083Z [rank2]:E1204 13:14:14.656000 391010 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0841366Z [rank2]:E1204 13:14:14.656000 391010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0841517Z [rank2]:E1204 13:14:14.656000 391010 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0841799Z [rank2]:E1204 13:14:14.656000 391010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0841948Z [rank2]:E1204 13:14:14.656000 391010 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0842428Z [rank2]:E1204 13:14:14.656000 391010 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 
2025-12-04T13:38:32.0842557Z [rank2]:E1204 13:14:14.656000 391010 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0842756Z [rank2]:E1204 13:14:14.656000 391010 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0843116Z [rank2]:E1204 13:14:14.656000 391010 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda 2025-12-04T13:38:32.0843229Z [rank2]:E1204 13:14:14.656000 391010 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0843464Z [rank2]:E1204 13:14:14.656000 391010 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0843628Z [rank2]:E1204 13:14:14.656000 391010 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.0843670Z dist init r=2, world=4 2025-12-04T13:38:32.0844006Z [rank0]:[W1204 13:14:14.819319352 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.0844050Z FAILED [57.9883s] [100%] 2025-12-04T13:38:32.0844052Z 2025-12-04T13:38:32.0844109Z =================================== FAILURES =================================== 2025-12-04T13:38:32.0844213Z ___ TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda ___ 2025-12-04T13:38:32.0844270Z Traceback (most recent call last): 2025-12-04T13:38:32.0844437Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.0844484Z self._join_processes(fn) 2025-12-04T13:38:32.0844659Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.0844718Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.0844897Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.0844944Z raise RuntimeError(error) 2025-12-04T13:38:32.0845024Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.0845073Z Traceback (most recent call last): 2025-12-04T13:38:32.0845236Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0845285Z getattr(self, test_name)() 2025-12-04T13:38:32.0845444Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0845483Z fn() 2025-12-04T13:38:32.0845649Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0845694Z method(*args, **kwargs) 2025-12-04T13:38:32.0845847Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0845892Z method(*args, **kwargs) 2025-12-04T13:38:32.0846047Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0846088Z with policy(): 2025-12-04T13:38:32.0846245Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0846293Z raise RuntimeError(msg) 2025-12-04T13:38:32.0846657Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 2025-12-04T13:38:32.0846664Z 2025-12-04T13:38:32.0846740Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0846974Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda 2025-12-04T13:38:32.0846977Z 2025-12-04T13:38:32.0847064Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0847077Z 2025-12-04T13:38:32.0847142Z Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.0847188Z Traceback (most recent call last): 2025-12-04T13:38:32.0847355Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0847397Z getattr(self, test_name)() 2025-12-04T13:38:32.0847560Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0847595Z fn() 2025-12-04T13:38:32.0847749Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0847789Z method(*args, **kwargs) 2025-12-04T13:38:32.0847944Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0847984Z method(*args, **kwargs) 2025-12-04T13:38:32.0848142Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0848191Z with policy(): 2025-12-04T13:38:32.0848347Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0848388Z raise RuntimeError(msg) 2025-12-04T13:38:32.0848741Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 
2025-12-04T13:38:32.0848744Z 2025-12-04T13:38:32.0848821Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0849050Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda 2025-12-04T13:38:32.0849053Z 2025-12-04T13:38:32.0849144Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0849146Z 2025-12-04T13:38:32.0849205Z Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.0849254Z Traceback (most recent call last): 2025-12-04T13:38:32.0849418Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0849475Z getattr(self, test_name)() 2025-12-04T13:38:32.0849676Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0849715Z fn() 2025-12-04T13:38:32.0849866Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0849911Z method(*args, **kwargs) 2025-12-04T13:38:32.0850060Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0850107Z method(*args, **kwargs) 2025-12-04T13:38:32.0850257Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0850299Z with policy(): 2025-12-04T13:38:32.0850471Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0850515Z raise RuntimeError(msg) 2025-12-04T13:38:32.0850870Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 2025-12-04T13:38:32.0850873Z 2025-12-04T13:38:32.0850948Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0851197Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda 2025-12-04T13:38:32.0851201Z 2025-12-04T13:38:32.0851286Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0851288Z 2025-12-04T13:38:32.0851290Z 2025-12-04T13:38:32.0851371Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.0851460Z Process 1 terminated with exit code 10, terminating remaining processes. 
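The RuntimeError repeated above is raised by the PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 policy, which snapshots per-device caching-allocator usage before the test body and compares it afterwards. The following is only a minimal sketch of that kind of before/after check, not the actual torch/testing/_internal/common_utils.py implementation; run_with_leak_check, the single device index, and the zero-growth threshold are assumptions for illustration.

import gc
import torch

def run_with_leak_check(fn, device=0):
    # Illustrative sketch only: snapshot caching-allocator usage, run the test
    # body, then verify the allocator returned to its starting point.
    if not torch.cuda.is_available():
        fn()
        return
    gc.collect()
    torch.cuda.synchronize(device)
    before = torch.cuda.memory_allocated(device)
    fn()
    gc.collect()
    torch.cuda.empty_cache()
    torch.cuda.synchronize(device)
    after = torch.cuda.memory_allocated(device)
    if after > before:
        raise RuntimeError(
            f"possible leak: caching allocator went from {before} to {after} "
            f"bytes on device {device}"
        )

Under this kind of check, the FSDP test above reports allocator growth from 512 to 12800 bytes on every rank, which is why each worker exits with code 10.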
2025-12-04T13:38:32.0851702Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-f3d78b10e04e870d.xml - 2025-12-04T13:38:32.0851766Z =========================== short test summary info ============================ 2025-12-04T13:38:32.0852014Z FAILED [57.9883s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_none_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.0852078Z Traceback (most recent call last): 2025-12-04T13:38:32.0852245Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0852290Z getattr(self, test_name)() 2025-12-04T13:38:32.0852453Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0852491Z fn() 2025-12-04T13:38:32.0852643Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0852687Z method(*args, **kwargs) 2025-12-04T13:38:32.0852839Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0852882Z method(*args, **kwargs) 2025-12-04T13:38:32.0853035Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0853076Z with policy(): 2025-12-04T13:38:32.0853229Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0853274Z raise RuntimeError(msg) 2025-12-04T13:38:32.0853637Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 
2025-12-04T13:38:32.0853643Z 2025-12-04T13:38:32.0853717Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0853951Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda 2025-12-04T13:38:32.0853954Z 2025-12-04T13:38:32.0854043Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0854045Z 2025-12-04T13:38:32.0854108Z Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.0854153Z Traceback (most recent call last): 2025-12-04T13:38:32.0854332Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0854376Z getattr(self, test_name)() 2025-12-04T13:38:32.0854540Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0854575Z fn() 2025-12-04T13:38:32.0854728Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0854768Z method(*args, **kwargs) 2025-12-04T13:38:32.0854921Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0854972Z method(*args, **kwargs) 2025-12-04T13:38:32.0855126Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0855164Z with policy(): 2025-12-04T13:38:32.0855321Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0855363Z raise RuntimeError(msg) 2025-12-04T13:38:32.0855718Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 
2025-12-04T13:38:32.0855721Z 2025-12-04T13:38:32.0855797Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0856028Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda 2025-12-04T13:38:32.0856040Z 2025-12-04T13:38:32.0856131Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0856133Z 2025-12-04T13:38:32.0856192Z Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.0856242Z Traceback (most recent call last): 2025-12-04T13:38:32.0856406Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0856450Z getattr(self, test_name)() 2025-12-04T13:38:32.0856610Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0856649Z fn() 2025-12-04T13:38:32.0856800Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0856845Z method(*args, **kwargs) 2025-12-04T13:38:32.0856996Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0857039Z method(*args, **kwargs) 2025-12-04T13:38:32.0857191Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0857232Z with policy(): 2025-12-04T13:38:32.0857396Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0857438Z raise RuntimeError(msg) 2025-12-04T13:38:32.0857793Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 2025-12-04T13:38:32.0857796Z 2025-12-04T13:38:32.0857870Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0858101Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda 2025-12-04T13:38:32.0858103Z 2025-12-04T13:38:32.0858205Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0858275Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.0858339Z ====================== 1 failed, 32 deselected in 58.15s ======================= 2025-12-04T13:38:32.0858380Z Got exit code 1 2025-12-04T13:38:32.0858420Z Retrying single test... 
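The UserWarning from _init_utils.py and the ProcessGroupNCCL warning that recur in each retry point at the same hygiene issues: the module is wrapped while still on CPU, barrier() has to guess the device, and destroy_process_group() is never called before exit. Below is a hedged sketch of what those recommendations look like when applied; the placeholder nn.Linear model, the environment handling, and the main() scaffolding are assumptions, not code from test_fsdp_core.py.

import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def main():
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    # Passing device_id here is what the c10d_logger barrier() warning suggests.
    dist.init_process_group("nccl", device_id=torch.device("cuda", local_rank))
    model = nn.Linear(8, 8)  # placeholder module, not the test's model
    # device_id tells FSDP to move the CPU module to the GPU before sharding
    # initialization, as the _init_utils.py UserWarning recommends.
    fsdp_model = FSDP(model, device_id=local_rank)
    out = fsdp_model(torch.randn(4, 8, device=f"cuda:{local_rank}"))
    out.sum().backward()
    # Explicit shutdown avoids the "destroy_process_group() was not called
    # before program exit" warning from ProcessGroupNCCL.
    dist.destroy_process_group()

if __name__ == "__main__":
    main()

Launched with, for example, torchrun --nproc_per_node=4 example.py, the device_id arguments address the CPU-init and barrier() warnings and the explicit destroy_process_group() call addresses the shutdown warning; none of this by itself explains the allocator growth the leak check reports.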
2025-12-04T13:38:32.0858616Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-7fe17ceee5b62d33.xml 2025-12-04T13:38:32.0858675Z ============================= test session starts ============================== 2025-12-04T13:38:32.0858804Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.0858846Z cachedir: .pytest_cache 2025-12-04T13:38:32.0859008Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.0859056Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.0859100Z configfile: pytest.ini 2025-12-04T13:38:32.0859267Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.0859341Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.0859606Z stepcurrent: skipping 11 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_none_cuda 2025-12-04T13:38:32.0859650Z Running 1 items in this shard 2025-12-04T13:38:32.0859653Z 2025-12-04T13:38:32.0859978Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_none_cuda I1204 13:14:18.922000 391341 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 391410 2025-12-04T13:38:32.0860136Z I1204 13:14:18.923000 391341 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 391411 2025-12-04T13:38:32.0864606Z I1204 13:14:18.923000 391341 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 391412 2025-12-04T13:38:32.0864770Z I1204 13:14:18.924000 391341 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 391413 2025-12-04T13:38:32.0865359Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0865400Z _warn_cpu_init() 2025-12-04T13:38:32.0866003Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0866046Z _warn_cpu_init() 2025-12-04T13:38:32.0866617Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0866664Z _warn_cpu_init() 2025-12-04T13:38:32.0866973Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.0867021Z return func(*args, **kwargs) 2025-12-04T13:38:32.0867594Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0867648Z _warn_cpu_init() 2025-12-04T13:38:32.0867799Z [rank2]:E1204 13:15:14.743000 391412 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0867965Z [rank2]:E1204 13:15:14.743000 391412 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0868260Z [rank2]:E1204 13:15:14.743000 391412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0868416Z [rank2]:E1204 13:15:14.743000 391412 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0868705Z [rank2]:E1204 13:15:14.743000 391412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0868847Z [rank2]:E1204 13:15:14.743000 391412 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0869128Z [rank2]:E1204 13:15:14.743000 391412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0869279Z [rank2]:E1204 13:15:14.743000 391412 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0869556Z [rank2]:E1204 13:15:14.743000 391412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0869737Z [rank2]:E1204 13:15:14.743000 391412 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0870017Z [rank2]:E1204 13:15:14.743000 391412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0870174Z [rank2]:E1204 13:15:14.743000 391412 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0870455Z [rank2]:E1204 13:15:14.743000 391412 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0870604Z [rank2]:E1204 13:15:14.743000 391412 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0871092Z [rank2]:E1204 13:15:14.743000 391412 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 2025-12-04T13:38:32.0871223Z [rank2]:E1204 13:15:14.743000 391412 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0871424Z [rank2]:E1204 13:15:14.743000 391412 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0871783Z [rank2]:E1204 13:15:14.743000 391412 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda 2025-12-04T13:38:32.0871913Z [rank2]:E1204 13:15:14.743000 391412 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0872131Z [rank2]:E1204 13:15:14.743000 391412 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0872298Z [rank2]:E1204 13:15:14.743000 391412 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.0872341Z dist init r=2, world=4 2025-12-04T13:38:32.0872481Z [rank3]:E1204 13:15:14.751000 391413 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0872644Z [rank3]:E1204 13:15:14.751000 391413 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0872932Z [rank3]:E1204 13:15:14.751000 391413 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0873104Z [rank3]:E1204 13:15:14.751000 391413 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0873393Z [rank3]:E1204 13:15:14.751000 391413 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0873520Z [rank3]:E1204 13:15:14.751000 391413 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0873800Z [rank3]:E1204 13:15:14.751000 391413 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0873949Z [rank3]:E1204 13:15:14.751000 391413 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0874231Z [rank3]:E1204 13:15:14.751000 391413 site-packages/torch/testing/_internal/common_distributed.py:935] 
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0874378Z [rank3]:E1204 13:15:14.751000 391413 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0874670Z [rank3]:E1204 13:15:14.751000 391413 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0874808Z [rank3]:E1204 13:15:14.751000 391413 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0875087Z [rank3]:E1204 13:15:14.751000 391413 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0875238Z [rank3]:E1204 13:15:14.751000 391413 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0875728Z [rank3]:E1204 13:15:14.751000 391413 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 2025-12-04T13:38:32.0875847Z [rank3]:E1204 13:15:14.751000 391413 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0876043Z [rank3]:E1204 13:15:14.751000 391413 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0876417Z [rank3]:E1204 13:15:14.751000 391413 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda 2025-12-04T13:38:32.0876535Z [rank3]:E1204 13:15:14.751000 391413 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0876750Z [rank3]:E1204 13:15:14.751000 391413 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0876918Z [rank3]:E1204 13:15:14.751000 391413 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.0876958Z dist init r=3, world=4 2025-12-04T13:38:32.0877098Z [rank1]:E1204 13:15:14.790000 391411 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0877262Z [rank1]:E1204 13:15:14.790000 391411 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0877565Z [rank1]:E1204 13:15:14.790000 391411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0877719Z [rank1]:E1204 13:15:14.790000 391411 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0878008Z [rank1]:E1204 13:15:14.790000 391411 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0878132Z [rank1]:E1204 13:15:14.790000 391411 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0878411Z [rank1]:E1204 13:15:14.790000 391411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0878562Z [rank1]:E1204 13:15:14.790000 391411 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0878861Z [rank1]:E1204 13:15:14.790000 391411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0879011Z [rank1]:E1204 13:15:14.790000 391411 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0879286Z [rank1]:E1204 13:15:14.790000 391411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0879428Z [rank1]:E1204 13:15:14.790000 391411 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0879756Z [rank1]:E1204 13:15:14.790000 391411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0879908Z [rank1]:E1204 13:15:14.790000 391411 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0880388Z [rank1]:E1204 13:15:14.790000 391411 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 
2025-12-04T13:38:32.0880515Z [rank1]:E1204 13:15:14.790000 391411 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0880712Z [rank1]:E1204 13:15:14.790000 391411 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0881069Z [rank1]:E1204 13:15:14.790000 391411 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda 2025-12-04T13:38:32.0881186Z [rank1]:E1204 13:15:14.790000 391411 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0881400Z [rank1]:E1204 13:15:14.790000 391411 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0881567Z [rank1]:E1204 13:15:14.790000 391411 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.0881622Z dist init r=1, world=4 2025-12-04T13:38:32.0881759Z [rank0]:E1204 13:15:14.796000 391410 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0881922Z [rank0]:E1204 13:15:14.796000 391410 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0882209Z [rank0]:E1204 13:15:14.796000 391410 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0882366Z [rank0]:E1204 13:15:14.796000 391410 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0882651Z [rank0]:E1204 13:15:14.796000 391410 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0882779Z [rank0]:E1204 13:15:14.796000 391410 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0883075Z [rank0]:E1204 13:15:14.796000 391410 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0883225Z [rank0]:E1204 13:15:14.796000 391410 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0883503Z [rank0]:E1204 13:15:14.796000 391410 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0883651Z [rank0]:E1204 13:15:14.796000 391410 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0883929Z [rank0]:E1204 13:15:14.796000 391410 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0884076Z [rank0]:E1204 13:15:14.796000 391410 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.0884365Z [rank0]:E1204 13:15:14.796000 391410 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0884514Z [rank0]:E1204 13:15:14.796000 391410 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0884993Z [rank0]:E1204 13:15:14.796000 391410 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 2025-12-04T13:38:32.0885121Z [rank0]:E1204 13:15:14.796000 391410 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0885318Z [rank0]:E1204 13:15:14.796000 391410 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0885675Z [rank0]:E1204 13:15:14.796000 391410 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda 2025-12-04T13:38:32.0885787Z [rank0]:E1204 13:15:14.796000 391410 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0886012Z [rank0]:E1204 13:15:14.796000 391410 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0886176Z [rank0]:E1204 13:15:14.796000 391410 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.0886217Z dist init r=0, world=4 2025-12-04T13:38:32.0886560Z [rank0]:[W1204 13:15:15.060813889 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.0886600Z FAILED [57.7865s] [100%] 2025-12-04T13:38:32.0886603Z 2025-12-04T13:38:32.0886663Z =================================== FAILURES =================================== 2025-12-04T13:38:32.0886762Z ___ TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda ___ 2025-12-04T13:38:32.0886813Z Traceback (most recent call last): 2025-12-04T13:38:32.0886979Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.0887027Z self._join_processes(fn) 2025-12-04T13:38:32.0887202Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.0887268Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.0887447Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.0887494Z raise RuntimeError(error) 2025-12-04T13:38:32.0887573Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.0887622Z Traceback (most recent call last): 2025-12-04T13:38:32.0887784Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0887830Z getattr(self, test_name)() 2025-12-04T13:38:32.0887988Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0888025Z fn() 2025-12-04T13:38:32.0888187Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0888231Z method(*args, **kwargs) 2025-12-04T13:38:32.0888381Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0888425Z method(*args, **kwargs) 2025-12-04T13:38:32.0888577Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0888614Z with policy(): 2025-12-04T13:38:32.0888782Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0888824Z raise RuntimeError(msg) 2025-12-04T13:38:32.0889180Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 
2025-12-04T13:38:32.0889182Z 2025-12-04T13:38:32.0889259Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0889493Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda 2025-12-04T13:38:32.0889495Z 2025-12-04T13:38:32.0889612Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0889614Z 2025-12-04T13:38:32.0889617Z 2025-12-04T13:38:32.0889697Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.0889803Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.0890039Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-7fe17ceee5b62d33.xml - 2025-12-04T13:38:32.0890102Z =========================== short test summary info ============================ 2025-12-04T13:38:32.0890353Z FAILED [57.7865s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_none_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.0890401Z Traceback (most recent call last): 2025-12-04T13:38:32.0890566Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0890614Z getattr(self, test_name)() 2025-12-04T13:38:32.0890776Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0890813Z fn() 2025-12-04T13:38:32.0890968Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0891011Z method(*args, **kwargs) 2025-12-04T13:38:32.0891179Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0891222Z method(*args, **kwargs) 2025-12-04T13:38:32.0891373Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0891412Z with policy(): 2025-12-04T13:38:32.0891565Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0891608Z raise RuntimeError(msg) 2025-12-04T13:38:32.0891959Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 2025-12-04T13:38:32.0891965Z 2025-12-04T13:38:32.0892053Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0892289Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_none_cuda 2025-12-04T13:38:32.0892291Z 2025-12-04T13:38:32.0892377Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0892444Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:38:32.0892506Z ====================== 1 failed, 32 deselected in 57.95s ======================= 2025-12-04T13:38:32.0892560Z Got exit code 1 2025-12-04T13:38:32.0892740Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_none_cuda 2025-12-04T13:38:32.0892871Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:32.0893060Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-2db6c407e5a97404.xml 2025-12-04T13:38:32.0893123Z ============================= test session starts ============================== 2025-12-04T13:38:32.0893237Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.0893281Z cachedir: .pytest_cache 2025-12-04T13:38:32.0893439Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.0893490Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.0893533Z configfile: pytest.ini 2025-12-04T13:38:32.0893715Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.0893791Z collecting ... collected 60 items / 12 deselected / 48 selected 2025-12-04T13:38:32.0893844Z stepcurrent: skipping 12 already run items. 2025-12-04T13:38:32.0893892Z Running 21 items in this shard 2025-12-04T13:38:32.0893894Z 2025-12-04T13:38:32.0894215Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda I1204 13:15:19.545000 391743 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 391812 2025-12-04T13:38:32.0894374Z I1204 13:15:19.546000 391743 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 391813 2025-12-04T13:38:32.0894527Z I1204 13:15:19.546000 391743 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 391814 2025-12-04T13:38:32.0894682Z I1204 13:15:19.547000 391743 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 391815 2025-12-04T13:38:32.0895275Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0895315Z _warn_cpu_init() 2025-12-04T13:38:32.0895891Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.0895930Z _warn_cpu_init() 2025-12-04T13:38:32.0896239Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.0896281Z return func(*args, **kwargs) 2025-12-04T13:38:32.0896858Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0896909Z _warn_cpu_init() 2025-12-04T13:38:32.0897483Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0897522Z _warn_cpu_init() 2025-12-04T13:38:32.0897667Z [rank0]:E1204 13:16:15.310000 391812 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0897831Z [rank0]:E1204 13:16:15.310000 391812 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0898123Z [rank0]:E1204 13:16:15.310000 391812 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0898296Z [rank0]:E1204 13:16:15.310000 391812 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0898585Z [rank0]:E1204 13:16:15.310000 391812 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0898713Z [rank0]:E1204 13:16:15.310000 391812 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0898992Z [rank0]:E1204 13:16:15.310000 391812 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0899142Z [rank0]:E1204 13:16:15.310000 391812 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0899429Z [rank0]:E1204 13:16:15.310000 391812 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0899628Z [rank0]:E1204 13:16:15.310000 391812 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0899907Z [rank0]:E1204 13:16:15.310000 391812 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0900045Z [rank0]:E1204 13:16:15.310000 391812 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0900327Z [rank0]:E1204 13:16:15.310000 391812 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0900479Z [rank0]:E1204 13:16:15.310000 391812 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0900981Z [rank0]:E1204 13:16:15.310000 391812 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 2025-12-04T13:38:32.0901100Z [rank0]:E1204 13:16:15.310000 391812 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0901296Z [rank0]:E1204 13:16:15.310000 391812 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0901686Z [rank0]:E1204 13:16:15.310000 391812 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0901803Z [rank0]:E1204 13:16:15.310000 391812 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0902015Z [rank0]:E1204 13:16:15.310000 391812 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0902183Z [rank0]:E1204 13:16:15.310000 391812 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.0902222Z dist init r=0, world=4 2025-12-04T13:38:32.0902364Z [rank3]:E1204 13:16:15.363000 391815 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0902537Z [rank3]:E1204 13:16:15.363000 391815 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0902826Z [rank3]:E1204 13:16:15.363000 391815 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0902979Z [rank3]:E1204 13:16:15.363000 391815 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0903267Z [rank3]:E1204 13:16:15.363000 391815 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0903395Z [rank3]:E1204 13:16:15.363000 391815 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0903671Z [rank3]:E1204 13:16:15.363000 391815 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0903822Z [rank3]:E1204 13:16:15.363000 391815 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0904112Z [rank3]:E1204 13:16:15.363000 391815 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0904261Z [rank3]:E1204 13:16:15.363000 391815 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0904536Z [rank3]:E1204 13:16:15.363000 391815 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0904679Z [rank3]:E1204 13:16:15.363000 391815 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0904972Z [rank3]:E1204 13:16:15.363000 391815 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0905119Z [rank3]:E1204 13:16:15.363000 391815 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0905607Z [rank3]:E1204 13:16:15.363000 391815 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 
2025-12-04T13:38:32.0905733Z [rank3]:E1204 13:16:15.363000 391815 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0905933Z [rank3]:E1204 13:16:15.363000 391815 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0906304Z [rank3]:E1204 13:16:15.363000 391815 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0906421Z [rank3]:E1204 13:16:15.363000 391815 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0906637Z [rank3]:E1204 13:16:15.363000 391815 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0906813Z [rank3]:E1204 13:16:15.363000 391815 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.0906854Z dist init r=3, world=4 2025-12-04T13:38:32.0906992Z [rank1]:E1204 13:16:15.382000 391813 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0907155Z [rank1]:E1204 13:16:15.382000 391813 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0907441Z [rank1]:E1204 13:16:15.382000 391813 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0907596Z [rank1]:E1204 13:16:15.382000 391813 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0907882Z [rank1]:E1204 13:16:15.382000 391813 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0908009Z [rank1]:E1204 13:16:15.382000 391813 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0908296Z [rank1]:E1204 13:16:15.382000 391813 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0908444Z [rank1]:E1204 13:16:15.382000 391813 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0908725Z [rank1]:E1204 13:16:15.382000 391813 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0908875Z [rank1]:E1204 13:16:15.382000 391813 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0909163Z [rank1]:E1204 13:16:15.382000 391813 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0909300Z [rank1]:E1204 13:16:15.382000 391813 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.0909622Z [rank1]:E1204 13:16:15.382000 391813 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0909773Z [rank1]:E1204 13:16:15.382000 391813 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0910283Z [rank1]:E1204 13:16:15.382000 391813 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 2025-12-04T13:38:32.0910402Z [rank1]:E1204 13:16:15.382000 391813 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0910597Z [rank1]:E1204 13:16:15.382000 391813 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0910971Z [rank1]:E1204 13:16:15.382000 391813 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0911099Z [rank1]:E1204 13:16:15.382000 391813 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0911314Z [rank1]:E1204 13:16:15.382000 391813 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0911482Z [rank1]:E1204 13:16:15.382000 391813 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.0911520Z dist init r=1, world=4 2025-12-04T13:38:32.0911658Z [rank2]:E1204 13:16:15.388000 391814 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0911818Z [rank2]:E1204 13:16:15.388000 391814 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0912107Z [rank2]:E1204 13:16:15.388000 391814 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0912263Z [rank2]:E1204 13:16:15.388000 391814 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0912566Z [rank2]:E1204 13:16:15.388000 391814 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0912690Z [rank2]:E1204 13:16:15.388000 391814 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0912969Z [rank2]:E1204 13:16:15.388000 391814 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0913119Z [rank2]:E1204 13:16:15.388000 391814 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:38:32.0913410Z [rank2]:E1204 13:16:15.388000 391814 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0913561Z [rank2]:E1204 13:16:15.388000 391814 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0913837Z [rank2]:E1204 13:16:15.388000 391814 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0913974Z [rank2]:E1204 13:16:15.388000 391814 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0914252Z [rank2]:E1204 13:16:15.388000 391814 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0914415Z [rank2]:E1204 13:16:15.388000 391814 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0914903Z [rank2]:E1204 13:16:15.388000 391814 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 2025-12-04T13:38:32.0915017Z [rank2]:E1204 13:16:15.388000 391814 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0915214Z [rank2]:E1204 13:16:15.388000 391814 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0915593Z [rank2]:E1204 13:16:15.388000 391814 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0915710Z [rank2]:E1204 13:16:15.388000 391814 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0915921Z [rank2]:E1204 13:16:15.388000 391814 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0916087Z [rank2]:E1204 13:16:15.388000 391814 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.0916127Z dist init r=2, world=4 2025-12-04T13:38:32.0916463Z [rank0]:[W1204 13:16:15.495789825 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.0916506Z FAILED [57.7779s] [ 4%] 2025-12-04T13:38:32.0916508Z 2025-12-04T13:38:32.0916565Z =================================== FAILURES =================================== 2025-12-04T13:38:32.0916687Z _ TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda _ 2025-12-04T13:38:32.0916734Z Traceback (most recent call last): 2025-12-04T13:38:32.0916901Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.0916945Z self._join_processes(fn) 2025-12-04T13:38:32.0917119Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.0917174Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.0917356Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.0917400Z raise RuntimeError(error) 2025-12-04T13:38:32.0917492Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.0917538Z Traceback (most recent call last): 2025-12-04T13:38:32.0917702Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0917744Z getattr(self, test_name)() 2025-12-04T13:38:32.0917907Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0917941Z fn() 2025-12-04T13:38:32.0918097Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0918152Z method(*args, **kwargs) 2025-12-04T13:38:32.0918306Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0918348Z method(*args, **kwargs) 2025-12-04T13:38:32.0918500Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0918540Z with policy(): 2025-12-04T13:38:32.0918693Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0918736Z raise RuntimeError(msg) 2025-12-04T13:38:32.0919099Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 
2025-12-04T13:38:32.0919103Z 2025-12-04T13:38:32.0919191Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0919434Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0919436Z 2025-12-04T13:38:32.0919527Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0919529Z 2025-12-04T13:38:32.0919635Z Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.0919680Z Traceback (most recent call last): 2025-12-04T13:38:32.0919846Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0919888Z getattr(self, test_name)() 2025-12-04T13:38:32.0920049Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0920084Z fn() 2025-12-04T13:38:32.0920238Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0920278Z method(*args, **kwargs) 2025-12-04T13:38:32.0920433Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0920473Z method(*args, **kwargs) 2025-12-04T13:38:32.0920639Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0920676Z with policy(): 2025-12-04T13:38:32.0920830Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0920870Z raise RuntimeError(msg) 2025-12-04T13:38:32.0921236Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 2025-12-04T13:38:32.0921240Z 2025-12-04T13:38:32.0921314Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0921571Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0921574Z 2025-12-04T13:38:32.0921663Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0921665Z 2025-12-04T13:38:32.0921667Z 2025-12-04T13:38:32.0921743Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.0921833Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:38:32.0922068Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-2db6c407e5a97404.xml - 2025-12-04T13:38:32.0922145Z =========================== short test summary info ============================ 2025-12-04T13:38:32.0922403Z FAILED [57.7779s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.0922453Z Traceback (most recent call last): 2025-12-04T13:38:32.0922618Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0922663Z getattr(self, test_name)() 2025-12-04T13:38:32.0922827Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0922864Z fn() 2025-12-04T13:38:32.0923015Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0923073Z method(*args, **kwargs) 2025-12-04T13:38:32.0923230Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0923270Z method(*args, **kwargs) 2025-12-04T13:38:32.0923426Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0923464Z with policy(): 2025-12-04T13:38:32.0923621Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0923662Z raise RuntimeError(msg) 2025-12-04T13:38:32.0924030Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 
2025-12-04T13:38:32.0924034Z 2025-12-04T13:38:32.0924108Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0924352Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0924355Z 2025-12-04T13:38:32.0924443Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0924455Z 2025-12-04T13:38:32.0924519Z Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.0924564Z Traceback (most recent call last): 2025-12-04T13:38:32.0924730Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0924775Z getattr(self, test_name)() 2025-12-04T13:38:32.0924936Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0924975Z fn() 2025-12-04T13:38:32.0925127Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0925169Z method(*args, **kwargs) 2025-12-04T13:38:32.0925338Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0925381Z method(*args, **kwargs) 2025-12-04T13:38:32.0925533Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0925573Z with policy(): 2025-12-04T13:38:32.0925726Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0925770Z raise RuntimeError(msg) 2025-12-04T13:38:32.0926131Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 2025-12-04T13:38:32.0926145Z 2025-12-04T13:38:32.0926222Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0926463Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0926470Z 2025-12-04T13:38:32.0926557Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0926626Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.0926688Z ====================== 1 failed, 12 deselected in 57.92s ======================= 2025-12-04T13:38:32.0926729Z Got exit code 1 2025-12-04T13:38:32.0926770Z Retrying single test... 
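The repeated RuntimeError above comes from PyTorch's CUDA memory-leak checker, which the printed repro commands enable with PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1: it snapshots GPU memory around each test at both the caching-allocator level and the driver level, and fails the test when either number has grown (the "512 ... is now reported as 12800" and "driver allocated memory was ... and is now ..." figures per rank). The following is a minimal, illustrative sketch of that kind of check built only on public torch.cuda APIs (torch.cuda.memory_allocated and torch.cuda.mem_get_info); it is an assumption-level approximation, not the CudaMemoryLeakCheck context manager that common_utils.py actually uses.

import contextlib
import torch

@contextlib.contextmanager
def assert_no_cuda_leak(device: int = 0):
    # Snapshot before the wrapped block: caching-allocator bytes plus a
    # driver-level "used" figure derived from cudaMemGetInfo/hipMemGetInfo.
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)
    free_before, total = torch.cuda.mem_get_info(device)
    yield
    # Snapshot after: return cached blocks to the driver first so the
    # driver-level numbers are comparable to the "before" reading.
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _ = torch.cuda.mem_get_info(device)
    if alloc_after > alloc_before or (total - free_after) > (total - free_before):
        raise RuntimeError(
            f"possible leak on device {device}: caching allocator "
            f"{alloc_before} -> {alloc_after} bytes, driver "
            f"{total - free_before} -> {total - free_after} bytes"
        )

Wrapping a test body as "with assert_no_cuda_leak(rank): ..." would raise on growth like the per-rank allocator jump reported above; in the CI run this check fires identically on every rank and on the single-test retry, which is why the harness marks the tests FAILED CONSISTENTLY rather than flaky.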
2025-12-04T13:38:32.0926965Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-f1234f2e29248a94.xml 2025-12-04T13:38:32.0927036Z ============================= test session starts ============================== 2025-12-04T13:38:32.0927153Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.0927195Z cachedir: .pytest_cache 2025-12-04T13:38:32.0927360Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.0927407Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.0927454Z configfile: pytest.ini 2025-12-04T13:38:32.0927618Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.0927695Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.0927933Z stepcurrent: skipping 12 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0927981Z Running 1 items in this shard 2025-12-04T13:38:32.0927983Z 2025-12-04T13:38:32.0928309Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda I1204 13:16:20.163000 392145 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 392214 2025-12-04T13:38:32.0928470Z I1204 13:16:20.163000 392145 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 392215 2025-12-04T13:38:32.0928623Z I1204 13:16:20.164000 392145 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 392216 2025-12-04T13:38:32.0928774Z I1204 13:16:20.164000 392145 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 392217 2025-12-04T13:38:32.0929372Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0929415Z _warn_cpu_init() 2025-12-04T13:38:32.0930026Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0930083Z _warn_cpu_init() 2025-12-04T13:38:32.0930653Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0930694Z _warn_cpu_init() 2025-12-04T13:38:32.0930985Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.0931032Z return func(*args, **kwargs) 2025-12-04T13:38:32.0931606Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0931662Z _warn_cpu_init() 2025-12-04T13:38:32.0931809Z [rank3]:E1204 13:17:16.104000 392217 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0931971Z [rank3]:E1204 13:17:16.104000 392217 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0932268Z [rank3]:E1204 13:17:16.104000 392217 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0932425Z [rank3]:E1204 13:17:16.104000 392217 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0932714Z [rank3]:E1204 13:17:16.104000 392217 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0932853Z [rank3]:E1204 13:17:16.104000 392217 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0933134Z [rank3]:E1204 13:17:16.104000 392217 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0933287Z [rank3]:E1204 13:17:16.104000 392217 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0933568Z [rank3]:E1204 13:17:16.104000 392217 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0933733Z [rank3]:E1204 13:17:16.104000 392217 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0934008Z [rank3]:E1204 13:17:16.104000 392217 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0934150Z [rank3]:E1204 13:17:16.104000 392217 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0934426Z [rank3]:E1204 13:17:16.104000 392217 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0934591Z [rank3]:E1204 13:17:16.104000 392217 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0935085Z [rank3]:E1204 13:17:16.104000 392217 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 2025-12-04T13:38:32.0935200Z [rank3]:E1204 13:17:16.104000 392217 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0935398Z [rank3]:E1204 13:17:16.104000 392217 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0935768Z [rank3]:E1204 13:17:16.104000 392217 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0935897Z [rank3]:E1204 13:17:16.104000 392217 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0936110Z [rank3]:E1204 13:17:16.104000 392217 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0936278Z [rank3]:E1204 13:17:16.104000 392217 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.0936318Z dist init r=3, world=4 2025-12-04T13:38:32.0936459Z [rank0]:E1204 13:17:16.133000 392214 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0936621Z [rank0]:E1204 13:17:16.133000 392214 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0936911Z [rank0]:E1204 13:17:16.133000 392214 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0937082Z [rank0]:E1204 13:17:16.133000 392214 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0937368Z [rank0]:E1204 13:17:16.133000 392214 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0937497Z [rank0]:E1204 13:17:16.133000 392214 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0937772Z [rank0]:E1204 13:17:16.133000 392214 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0937926Z [rank0]:E1204 13:17:16.133000 392214 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0938215Z [rank0]:E1204 13:17:16.133000 392214 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0938362Z [rank0]:E1204 13:17:16.133000 392214 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0938640Z [rank0]:E1204 13:17:16.133000 392214 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0938787Z [rank0]:E1204 13:17:16.133000 392214 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0939070Z [rank0]:E1204 13:17:16.133000 392214 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0939221Z [rank0]:E1204 13:17:16.133000 392214 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0939745Z [rank0]:E1204 13:17:16.133000 392214 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 2025-12-04T13:38:32.0939865Z [rank0]:E1204 13:17:16.133000 392214 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0940077Z [rank0]:E1204 13:17:16.133000 392214 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0940450Z [rank0]:E1204 13:17:16.133000 392214 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0940564Z [rank0]:E1204 13:17:16.133000 392214 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0940777Z [rank0]:E1204 13:17:16.133000 392214 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0940940Z [rank0]:E1204 13:17:16.133000 392214 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.0940984Z dist init r=0, world=4 2025-12-04T13:38:32.0941121Z [rank1]:E1204 13:17:16.136000 392215 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0941284Z [rank1]:E1204 13:17:16.136000 392215 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0941596Z [rank1]:E1204 13:17:16.136000 392215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0941752Z [rank1]:E1204 13:17:16.136000 392215 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0942040Z 
[rank1]:E1204 13:17:16.136000 392215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0942165Z [rank1]:E1204 13:17:16.136000 392215 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0942457Z [rank1]:E1204 13:17:16.136000 392215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0942605Z [rank1]:E1204 13:17:16.136000 392215 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0942887Z [rank1]:E1204 13:17:16.136000 392215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0943038Z [rank1]:E1204 13:17:16.136000 392215 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0943329Z [rank1]:E1204 13:17:16.136000 392215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0943469Z [rank1]:E1204 13:17:16.136000 392215 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0943749Z [rank1]:E1204 13:17:16.136000 392215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0943901Z [rank1]:E1204 13:17:16.136000 392215 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0944390Z [rank1]:E1204 13:17:16.136000 392215 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 
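The _init_utils.py UserWarning repeated above recommends passing `device_id` so FSDP moves the CPU module to its GPU before sharding (also required for `sync_module_states=True`). A minimal sketch of that pattern; the toy model and rank handling are placeholders, not the test's actual module:

    # Minimal sketch: passing device_id so FSDP shards on GPU instead of CPU.
    # The Linear model here is illustrative only.
    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_model(rank: int) -> FSDP:
        model = torch.nn.Linear(8, 8)  # starts on CPU, as in the warning above
        return FSDP(
            model,
            device_id=rank,            # shard on this GPU; silences the CPU-init warning
            sync_module_states=True,   # needs the module on GPU, hence device_id
        )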
2025-12-04T13:38:32.0944520Z [rank1]:E1204 13:17:16.136000 392215 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0944719Z [rank1]:E1204 13:17:16.136000 392215 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0945086Z [rank1]:E1204 13:17:16.136000 392215 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0945203Z [rank1]:E1204 13:17:16.136000 392215 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0945417Z [rank1]:E1204 13:17:16.136000 392215 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0945585Z [rank1]:E1204 13:17:16.136000 392215 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.0945624Z dist init r=1, world=4 2025-12-04T13:38:32.0945774Z [rank2]:E1204 13:17:16.161000 392216 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0945934Z [rank2]:E1204 13:17:16.161000 392216 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0946224Z [rank2]:E1204 13:17:16.161000 392216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0946384Z [rank2]:E1204 13:17:16.161000 392216 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0946680Z [rank2]:E1204 13:17:16.161000 392216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0946809Z [rank2]:E1204 13:17:16.161000 392216 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0947087Z [rank2]:E1204 13:17:16.161000 392216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0947238Z [rank2]:E1204 13:17:16.161000 392216 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0947525Z [rank2]:E1204 13:17:16.161000 392216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0947681Z [rank2]:E1204 13:17:16.161000 392216 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0947961Z [rank2]:E1204 13:17:16.161000 392216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0948098Z [rank2]:E1204 13:17:16.161000 392216 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.0948379Z [rank2]:E1204 13:17:16.161000 392216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0948529Z [rank2]:E1204 13:17:16.161000 392216 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0949029Z [rank2]:E1204 13:17:16.161000 392216 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 2025-12-04T13:38:32.0949146Z [rank2]:E1204 13:17:16.161000 392216 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0949342Z [rank2]:E1204 13:17:16.161000 392216 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0949765Z [rank2]:E1204 13:17:16.161000 392216 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0949880Z [rank2]:E1204 13:17:16.161000 392216 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0950109Z [rank2]:E1204 13:17:16.161000 392216 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0950274Z [rank2]:E1204 13:17:16.161000 392216 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.0950316Z dist init r=2, world=4 2025-12-04T13:38:32.0950651Z [rank0]:[W1204 13:17:16.378939219 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.0950698Z FAILED [57.7858s] [100%] 2025-12-04T13:38:32.0950700Z 2025-12-04T13:38:32.0950760Z =================================== FAILURES =================================== 2025-12-04T13:38:32.0950868Z _ TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda _ 2025-12-04T13:38:32.0950932Z Traceback (most recent call last): 2025-12-04T13:38:32.0951099Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.0951146Z self._join_processes(fn) 2025-12-04T13:38:32.0951320Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.0951378Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.0951559Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.0951620Z raise RuntimeError(error) 2025-12-04T13:38:32.0951699Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.0951748Z Traceback (most recent call last): 2025-12-04T13:38:32.0951911Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0951957Z getattr(self, test_name)() 2025-12-04T13:38:32.0952118Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0952156Z fn() 2025-12-04T13:38:32.0952314Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0952359Z method(*args, **kwargs) 2025-12-04T13:38:32.0952510Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0952559Z method(*args, **kwargs) 2025-12-04T13:38:32.0952725Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0952766Z with policy(): 2025-12-04T13:38:32.0952923Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0952968Z raise RuntimeError(msg) 2025-12-04T13:38:32.0953336Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 
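The ProcessGroupNCCL warning above recommends tearing the process group down before the program exits. A minimal sketch of that cleanup, assuming an ordinary init_process_group-style setup rather than the test harness itself:

    # Minimal sketch: explicit teardown to avoid the
    # "destroy_process_group() was not called before program exit" warning.
    import torch.distributed as dist

    def main() -> None:
        dist.init_process_group(backend="nccl")
        try:
            ...  # collectives / training steps
        finally:
            dist.destroy_process_group()  # release communicators before exit

    if __name__ == "__main__":
        main()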
2025-12-04T13:38:32.0953338Z 2025-12-04T13:38:32.0953412Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0953659Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0953663Z 2025-12-04T13:38:32.0953750Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0953753Z 2025-12-04T13:38:32.0953754Z 2025-12-04T13:38:32.0953834Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.0953933Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.0954168Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-f1234f2e29248a94.xml - 2025-12-04T13:38:32.0954229Z =========================== short test summary info ============================ 2025-12-04T13:38:32.0954492Z FAILED [57.7858s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.0954543Z Traceback (most recent call last): 2025-12-04T13:38:32.0954708Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0954754Z getattr(self, test_name)() 2025-12-04T13:38:32.0954926Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0954966Z fn() 2025-12-04T13:38:32.0955120Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0955165Z method(*args, **kwargs) 2025-12-04T13:38:32.0955315Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0955358Z method(*args, **kwargs) 2025-12-04T13:38:32.0955510Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0955567Z with policy(): 2025-12-04T13:38:32.0955720Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0955764Z raise RuntimeError(msg) 2025-12-04T13:38:32.0956130Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 2025-12-04T13:38:32.0956132Z 2025-12-04T13:38:32.0956210Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0956454Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0956462Z 2025-12-04T13:38:32.0956566Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0956633Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
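The RuntimeError above is raised by the suite's CUDA memory-leak check, which compares caching-allocator and driver allocations before and after the test body. A rough illustration of the same before/after idea using public allocator counters; this is not the internal policy from common_utils.py:

    # Rough sketch of a before/after memory comparison in the spirit of the
    # leak check that raised the RuntimeError above; not the real implementation.
    import torch

    def run_with_leak_check(fn, device: int = 0) -> None:
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        before = torch.cuda.memory_allocated(device)   # caching-allocator bytes in use
        fn()
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        after = torch.cuda.memory_allocated(device)
        if after > before:
            raise RuntimeError(
                f"possible leak on device {device}: {before} -> {after} bytes allocated"
            )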
2025-12-04T13:38:32.0956696Z ====================== 1 failed, 32 deselected in 57.93s ======================= 2025-12-04T13:38:32.0956738Z Got exit code 1 2025-12-04T13:38:32.0956779Z Retrying single test... 2025-12-04T13:38:32.0956973Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-3cb93b52e3d78be7.xml 2025-12-04T13:38:32.0957031Z ============================= test session starts ============================== 2025-12-04T13:38:32.0957147Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.0957189Z cachedir: .pytest_cache 2025-12-04T13:38:32.0957352Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.0957400Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.0957444Z configfile: pytest.ini 2025-12-04T13:38:32.0957608Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.0957686Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.0957934Z stepcurrent: skipping 12 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0957981Z Running 1 items in this shard 2025-12-04T13:38:32.0957983Z 2025-12-04T13:38:32.0958300Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda I1204 13:17:20.850000 392547 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 392616 2025-12-04T13:38:32.0958460Z I1204 13:17:20.850000 392547 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 392617 2025-12-04T13:38:32.0958617Z I1204 13:17:20.851000 392547 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 392618 2025-12-04T13:38:32.0958778Z I1204 13:17:20.851000 392547 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 392619 2025-12-04T13:38:32.0959363Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0959401Z _warn_cpu_init() 2025-12-04T13:38:32.0960015Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.0960058Z _warn_cpu_init() 2025-12-04T13:38:32.0960629Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0960672Z _warn_cpu_init() 2025-12-04T13:38:32.0961255Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0961297Z _warn_cpu_init() 2025-12-04T13:38:32.0961593Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.0961636Z return func(*args, **kwargs) 2025-12-04T13:38:32.0961782Z [rank0]:E1204 13:18:16.796000 392616 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0961947Z [rank0]:E1204 13:18:16.796000 392616 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0962239Z [rank0]:E1204 13:18:16.796000 392616 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0962407Z [rank0]:E1204 13:18:16.796000 392616 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0962695Z [rank0]:E1204 13:18:16.796000 392616 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0962821Z [rank0]:E1204 13:18:16.796000 392616 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0963103Z [rank0]:E1204 13:18:16.796000 392616 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0963267Z [rank0]:E1204 13:18:16.796000 392616 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0963543Z [rank0]:E1204 13:18:16.796000 392616 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0963693Z [rank0]:E1204 13:18:16.796000 392616 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0963971Z [rank0]:E1204 13:18:16.796000 392616 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0964126Z [rank0]:E1204 13:18:16.796000 392616 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0964404Z [rank0]:E1204 13:18:16.796000 392616 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0964556Z [rank0]:E1204 13:18:16.796000 392616 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0965045Z [rank0]:E1204 13:18:16.796000 392616 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 2025-12-04T13:38:32.0965173Z [rank0]:E1204 13:18:16.796000 392616 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0965372Z [rank0]:E1204 13:18:16.796000 392616 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0965745Z [rank0]:E1204 13:18:16.796000 392616 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0965862Z [rank0]:E1204 13:18:16.796000 392616 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0966074Z [rank0]:E1204 13:18:16.796000 392616 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0966244Z [rank0]:E1204 13:18:16.796000 392616 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.0966288Z dist init r=0, world=4 2025-12-04T13:38:32.0966428Z [rank2]:E1204 13:18:16.803000 392618 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0966601Z [rank2]:E1204 13:18:16.803000 392618 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0966890Z [rank2]:E1204 13:18:16.803000 392618 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0967047Z [rank2]:E1204 13:18:16.803000 392618 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0967331Z [rank2]:E1204 13:18:16.803000 392618 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0967459Z [rank2]:E1204 13:18:16.803000 392618 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0967747Z [rank2]:E1204 13:18:16.803000 392618 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0967899Z [rank2]:E1204 13:18:16.803000 392618 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0968180Z [rank2]:E1204 13:18:16.803000 392618 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0968339Z [rank2]:E1204 13:18:16.803000 392618 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0968620Z [rank2]:E1204 13:18:16.803000 392618 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0968758Z [rank2]:E1204 13:18:16.803000 392618 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0969037Z [rank2]:E1204 13:18:16.803000 392618 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0969187Z [rank2]:E1204 13:18:16.803000 392618 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0969710Z [rank2]:E1204 13:18:16.803000 392618 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 
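The c10d_logger barrier() warning earlier in this session can be muted by binding the process group to a specific device at init time. A minimal sketch, assuming the local rank comes from the usual LOCAL_RANK environment variable:

    # Minimal sketch: binding the process group to an explicit device so that
    # barrier() does not have to infer the device from the current context.
    import os
    import torch
    import torch.distributed as dist

    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    torch.cuda.set_device(local_rank)
    dist.init_process_group(
        backend="nccl",
        device_id=torch.device("cuda", local_rank),  # mutes the barrier() warning
    )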
2025-12-04T13:38:32.0969845Z [rank2]:E1204 13:18:16.803000 392618 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0970044Z [rank2]:E1204 13:18:16.803000 392618 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0970414Z [rank2]:E1204 13:18:16.803000 392618 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0970530Z [rank2]:E1204 13:18:16.803000 392618 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0970745Z [rank2]:E1204 13:18:16.803000 392618 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0970924Z [rank2]:E1204 13:18:16.803000 392618 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.0970967Z dist init r=2, world=4 2025-12-04T13:38:32.0971107Z [rank1]:E1204 13:18:16.818000 392617 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0971266Z [rank1]:E1204 13:18:16.818000 392617 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0971556Z [rank1]:E1204 13:18:16.818000 392617 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0971712Z [rank1]:E1204 13:18:16.818000 392617 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0972013Z [rank1]:E1204 13:18:16.818000 392617 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0972137Z [rank1]:E1204 13:18:16.818000 392617 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0972421Z [rank1]:E1204 13:18:16.818000 392617 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0972569Z [rank1]:E1204 13:18:16.818000 392617 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0972868Z [rank1]:E1204 13:18:16.818000 392617 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0973019Z [rank1]:E1204 13:18:16.818000 392617 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0973298Z [rank1]:E1204 13:18:16.818000 392617 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0973437Z [rank1]:E1204 13:18:16.818000 392617 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.0973720Z [rank1]:E1204 13:18:16.818000 392617 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0973884Z [rank1]:E1204 13:18:16.818000 392617 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0974370Z [rank1]:E1204 13:18:16.818000 392617 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 2025-12-04T13:38:32.0974488Z [rank1]:E1204 13:18:16.818000 392617 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0974686Z [rank1]:E1204 13:18:16.818000 392617 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0975058Z [rank1]:E1204 13:18:16.818000 392617 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0975177Z [rank1]:E1204 13:18:16.818000 392617 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0975398Z [rank1]:E1204 13:18:16.818000 392617 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0975566Z [rank1]:E1204 13:18:16.818000 392617 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.0975604Z dist init r=1, world=4 2025-12-04T13:38:32.0975747Z [rank3]:E1204 13:18:16.826000 392619 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.0975911Z [rank3]:E1204 13:18:16.826000 392619 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.0976209Z [rank3]:E1204 13:18:16.826000 392619 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0976366Z [rank3]:E1204 13:18:16.826000 392619 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.0976650Z [rank3]:E1204 13:18:16.826000 392619 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0976776Z [rank3]:E1204 13:18:16.826000 392619 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.0977063Z [rank3]:E1204 13:18:16.826000 392619 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0977215Z [rank3]:E1204 13:18:16.826000 392619 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:38:32.0977495Z [rank3]:E1204 13:18:16.826000 392619 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0977642Z [rank3]:E1204 13:18:16.826000 392619 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.0977921Z [rank3]:E1204 13:18:16.826000 392619 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0978069Z [rank3]:E1204 13:18:16.826000 392619 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.0978351Z [rank3]:E1204 13:18:16.826000 392619 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0978500Z [rank3]:E1204 13:18:16.826000 392619 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.0978990Z [rank3]:E1204 13:18:16.826000 392619 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 2025-12-04T13:38:32.0979109Z [rank3]:E1204 13:18:16.826000 392619 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0979304Z [rank3]:E1204 13:18:16.826000 392619 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0979718Z [rank3]:E1204 13:18:16.826000 392619 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0979832Z [rank3]:E1204 13:18:16.826000 392619 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.0980045Z [rank3]:E1204 13:18:16.826000 392619 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0980212Z [rank3]:E1204 13:18:16.826000 392619 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.0980255Z dist init r=3, world=4 2025-12-04T13:38:32.0980607Z [rank0]:[W1204 13:18:17.057057895 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.0980652Z FAILED [57.8792s] [100%] 2025-12-04T13:38:32.0980654Z 2025-12-04T13:38:32.0980714Z =================================== FAILURES =================================== 2025-12-04T13:38:32.0980824Z _ TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda _ 2025-12-04T13:38:32.0980873Z Traceback (most recent call last): 2025-12-04T13:38:32.0981038Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.0981099Z self._join_processes(fn) 2025-12-04T13:38:32.0981275Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.0981334Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.0981516Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.0981564Z raise RuntimeError(error) 2025-12-04T13:38:32.0981643Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.0981693Z Traceback (most recent call last): 2025-12-04T13:38:32.0981856Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0981902Z getattr(self, test_name)() 2025-12-04T13:38:32.0982061Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0982115Z fn() 2025-12-04T13:38:32.0982268Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0982314Z method(*args, **kwargs) 2025-12-04T13:38:32.0982467Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0982512Z method(*args, **kwargs) 2025-12-04T13:38:32.0982667Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0982707Z with policy(): 2025-12-04T13:38:32.0982864Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0982909Z raise RuntimeError(msg) 2025-12-04T13:38:32.0983276Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 
2025-12-04T13:38:32.0983280Z 2025-12-04T13:38:32.0983356Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0983612Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0983614Z 2025-12-04T13:38:32.0983703Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0983706Z 2025-12-04T13:38:32.0983707Z 2025-12-04T13:38:32.0983786Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.0983875Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.0984113Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-3cb93b52e3d78be7.xml - 2025-12-04T13:38:32.0984178Z =========================== short test summary info ============================ 2025-12-04T13:38:32.0984447Z FAILED [57.8792s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.0984497Z Traceback (most recent call last): 2025-12-04T13:38:32.0984662Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.0984707Z getattr(self, test_name)() 2025-12-04T13:38:32.0984868Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.0984910Z fn() 2025-12-04T13:38:32.0985075Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0985121Z method(*args, **kwargs) 2025-12-04T13:38:32.0985277Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.0985321Z method(*args, **kwargs) 2025-12-04T13:38:32.0985475Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.0985516Z with policy(): 2025-12-04T13:38:32.0985670Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.0985715Z raise RuntimeError(msg) 2025-12-04T13:38:32.0986077Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 2025-12-04T13:38:32.0986094Z 2025-12-04T13:38:32.0986169Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.0986412Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0986414Z 2025-12-04T13:38:32.0986502Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.0986569Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:38:32.0986632Z ====================== 1 failed, 32 deselected in 58.02s ======================= 2025-12-04T13:38:32.0986674Z Got exit code 1 2025-12-04T13:38:32.0986870Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.0987001Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:32.0987192Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-8a4f83dd6075c6e2.xml 2025-12-04T13:38:32.0987254Z ============================= test session starts ============================== 2025-12-04T13:38:32.0987380Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.0987425Z cachedir: .pytest_cache 2025-12-04T13:38:32.0987585Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.0987636Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.0987677Z configfile: pytest.ini 2025-12-04T13:38:32.0987842Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.0987917Z collecting ... collected 60 items / 13 deselected / 47 selected 2025-12-04T13:38:32.0987974Z stepcurrent: skipping 13 already run items. 2025-12-04T13:38:32.0988018Z Running 20 items in this shard 2025-12-04T13:38:32.0988023Z 2025-12-04T13:38:32.0988355Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda I1204 13:18:21.513000 392949 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 393018 2025-12-04T13:38:32.0988514Z I1204 13:18:21.514000 392949 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 393019 2025-12-04T13:38:32.0988667Z I1204 13:18:21.515000 392949 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 393020 2025-12-04T13:38:32.0988821Z I1204 13:18:21.515000 392949 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 393021 2025-12-04T13:38:32.0989411Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0989454Z _warn_cpu_init() 2025-12-04T13:38:32.0989985Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
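The warning above fires because `device_id` was given as a bare "cuda" device with no index; fixing the current device first, or passing an indexed device, makes the mapping explicit. A minimal sketch of both options (the rank handling is illustrative):

    # Minimal sketch: avoiding the "device_id cuda ... does not have an explicit
    # index" warning by pinning the device index before (or in) FSDP init.
    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap(module: torch.nn.Module, rank: int) -> FSDP:
        # Option 1: make the current device explicit before FSDP initialization.
        torch.cuda.set_device(rank)
        # Option 2: pass an indexed device rather than the bare "cuda".
        return FSDP(module, device_id=torch.device("cuda", rank))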
2025-12-04T13:38:32.0990049Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.0990645Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0990683Z _warn_cpu_init() 2025-12-04T13:38:32.0991175Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.0991240Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.0991829Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0991870Z _warn_cpu_init() 2025-12-04T13:38:32.0992361Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.0992425Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.0993017Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.0993055Z _warn_cpu_init() 2025-12-04T13:38:32.0993352Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0993436Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.0993743Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:38:32.0993823Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0994321Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.0994382Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.0994669Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0994766Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.0995256Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.0995318Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.0995605Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0995686Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0995972Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0996057Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.0996559Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.0996618Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.0996909Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0996986Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0997286Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
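Note: the FutureWarnings around this point state that the NO_SHARD sharding strategy is deprecated and suggest DistributedDataParallel instead. A minimal sketch of that replacement follows; `model` and `rank` are illustrative placeholders rather than names from the test, and an already-initialized process group is assumed.

    # Sketch only: wrapping an unsharded model with DDP, as the FutureWarning
    # suggests, instead of FSDP with ShardingStrategy.NO_SHARD.
    import torch
    from torch.nn.parallel import DistributedDataParallel as DDP

    def wrap_unsharded(model: torch.nn.Module, rank: int) -> DDP:
        model = model.to(torch.device("cuda", rank))  # move to this rank's GPU
        return DDP(model, device_ids=[rank])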
2025-12-04T13:38:32.0997367Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.0997856Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.0997928Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.0998219Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.0998298Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.0999568Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.0999750Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.0999982Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1000029Z return func(*args, **kwargs) 2025-12-04T13:38:32.1001297Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
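Note: the UserWarning from torch/autograd/graph.py above names its own opt-out. If the AccumulateGrad stream mismatch is known to be intentional, the warning text says it can be silenced as sketched below; this is appropriate only after the mismatch has been confirmed as intended.

    # Sketch: suppressing the AccumulateGrad stream-mismatch warning, using the
    # call named in the warning text itself.
    import torch

    torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)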
2025-12-04T13:38:32.1001424Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1001651Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1001698Z return func(*args, **kwargs) 2025-12-04T13:38:32.1002970Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1003107Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1003336Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1003378Z return func(*args, **kwargs) 2025-12-04T13:38:32.1004639Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1004775Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1005001Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1005046Z return func(*args, **kwargs) 2025-12-04T13:38:32.1005274Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 
2025-12-04T13:38:32.1005316Z return func(*args, **kwargs) 2025-12-04T13:38:32.1005540Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.1005597Z return func(*args, **kwargs) 2025-12-04T13:38:32.1005820Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.1005860Z return func(*args, **kwargs) 2025-12-04T13:38:32.1006081Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.1006123Z return func(*args, **kwargs) 2025-12-04T13:38:32.1006417Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.1006458Z return func(*args, **kwargs) 2025-12-04T13:38:32.1006617Z [rank2]:E1204 13:18:30.115000 393020 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1006782Z [rank2]:E1204 13:18:30.115000 393020 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1007081Z [rank2]:E1204 13:18:30.115000 393020 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1007243Z [rank2]:E1204 13:18:30.115000 393020 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1007540Z [rank2]:E1204 13:18:30.115000 393020 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1007670Z [rank2]:E1204 13:18:30.115000 393020 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1007951Z [rank2]:E1204 13:18:30.115000 393020 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1008107Z [rank2]:E1204 13:18:30.115000 393020 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1008385Z [rank2]:E1204 13:18:30.115000 393020 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1008548Z [rank2]:E1204 13:18:30.115000 393020 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1008828Z [rank2]:E1204 13:18:30.115000 393020 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1008967Z [rank2]:E1204 13:18:30.115000 393020 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1009250Z 
[rank2]:E1204 13:18:30.115000 393020 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1009399Z [rank2]:E1204 13:18:30.115000 393020 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1009930Z [rank2]:E1204 13:18:30.115000 393020 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 2. CUDA driver allocated memory was 2300575744 and is now 17475567616. 2025-12-04T13:38:32.1010058Z [rank2]:E1204 13:18:30.115000 393020 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1010259Z [rank2]:E1204 13:18:30.115000 393020 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1010626Z [rank2]:E1204 13:18:30.115000 393020 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:38:32.1010742Z [rank2]:E1204 13:18:30.115000 393020 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1010971Z [rank2]:E1204 13:18:30.115000 393020 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1011140Z [rank2]:E1204 13:18:30.115000 393020 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.1011183Z dist init r=2, world=4 2025-12-04T13:38:32.1011320Z [rank1]:E1204 13:18:30.126000 393019 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1011483Z [rank1]:E1204 13:18:30.126000 393019 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1011786Z [rank1]:E1204 13:18:30.126000 393019 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1011953Z [rank1]:E1204 13:18:30.126000 393019 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1012247Z [rank1]:E1204 13:18:30.126000 393019 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1012372Z [rank1]:E1204 13:18:30.126000 393019 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1012653Z [rank1]:E1204 13:18:30.126000 393019 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1012815Z [rank1]:E1204 13:18:30.126000 393019 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T13:38:32.1013095Z [rank1]:E1204 13:18:30.126000 393019 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1013242Z [rank1]:E1204 13:18:30.126000 393019 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1013523Z [rank1]:E1204 13:18:30.126000 393019 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1013663Z [rank1]:E1204 13:18:30.126000 393019 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1013942Z [rank1]:E1204 13:18:30.126000 393019 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1014095Z [rank1]:E1204 13:18:30.126000 393019 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1014586Z [rank1]:E1204 13:18:30.126000 393019 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 1. CUDA driver allocated memory was 2317352960 and is now 17492344832. 2025-12-04T13:38:32.1014703Z [rank1]:E1204 13:18:30.126000 393019 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1014899Z [rank1]:E1204 13:18:30.126000 393019 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1015275Z [rank1]:E1204 13:18:30.126000 393019 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:38:32.1015394Z [rank1]:E1204 13:18:30.126000 393019 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1015609Z [rank1]:E1204 13:18:30.126000 393019 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1015776Z [rank1]:E1204 13:18:30.126000 393019 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.1015816Z dist init r=1, world=4 2025-12-04T13:38:32.1015969Z [rank3]:E1204 13:18:30.128000 393021 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1016130Z [rank3]:E1204 13:18:30.128000 393021 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1016421Z [rank3]:E1204 13:18:30.128000 393021 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1016576Z [rank3]:E1204 13:18:30.128000 393021 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:38:32.1016865Z [rank3]:E1204 13:18:30.128000 393021 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1016993Z [rank3]:E1204 13:18:30.128000 393021 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1017279Z [rank3]:E1204 13:18:30.128000 393021 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1017430Z [rank3]:E1204 13:18:30.128000 393021 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1017706Z [rank3]:E1204 13:18:30.128000 393021 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1017856Z [rank3]:E1204 13:18:30.128000 393021 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1018132Z [rank3]:E1204 13:18:30.128000 393021 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1018273Z [rank3]:E1204 13:18:30.128000 393021 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1018565Z [rank3]:E1204 13:18:30.128000 393021 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1018714Z [rank3]:E1204 13:18:30.128000 393021 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1019198Z [rank3]:E1204 13:18:30.128000 393021 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 3. CUDA driver allocated memory was 2250244096 and is now 17425235968. 
2025-12-04T13:38:32.1019315Z [rank3]:E1204 13:18:30.128000 393021 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1019526Z [rank3]:E1204 13:18:30.128000 393021 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1019923Z [rank3]:E1204 13:18:30.128000 393021 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:38:32.1020036Z [rank3]:E1204 13:18:30.128000 393021 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1020253Z [rank3]:E1204 13:18:30.128000 393021 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1020432Z [rank3]:E1204 13:18:30.128000 393021 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.1020475Z dist init r=3, world=4 2025-12-04T13:38:32.1020615Z [rank0]:E1204 13:18:30.173000 393018 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1020778Z [rank0]:E1204 13:18:30.173000 393018 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1021066Z [rank0]:E1204 13:18:30.173000 393018 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1021222Z [rank0]:E1204 13:18:30.173000 393018 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1021513Z [rank0]:E1204 13:18:30.173000 393018 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1021659Z [rank0]:E1204 13:18:30.173000 393018 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1021938Z [rank0]:E1204 13:18:30.173000 393018 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1022085Z [rank0]:E1204 13:18:30.173000 393018 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1022363Z [rank0]:E1204 13:18:30.173000 393018 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1022511Z [rank0]:E1204 13:18:30.173000 393018 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1022790Z [rank0]:E1204 13:18:30.173000 393018 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1022938Z [rank0]:E1204 13:18:30.173000 393018 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.1023219Z [rank0]:E1204 13:18:30.173000 393018 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1023370Z [rank0]:E1204 13:18:30.173000 393018 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1023863Z [rank0]:E1204 13:18:30.173000 393018 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 0. CUDA driver allocated memory was 2453667840 and is now 17628659712. 2025-12-04T13:38:32.1023981Z [rank0]:E1204 13:18:30.173000 393018 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1024177Z [rank0]:E1204 13:18:30.173000 393018 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1024540Z [rank0]:E1204 13:18:30.173000 393018 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:38:32.1024668Z [rank0]:E1204 13:18:30.173000 393018 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1024881Z [rank0]:E1204 13:18:30.173000 393018 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1025050Z [rank0]:E1204 13:18:30.173000 393018 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.1025090Z dist init r=0, world=4 2025-12-04T13:38:32.1025429Z [rank2]:[W1204 13:18:30.283400411 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1025760Z [rank1]:[W1204 13:18:30.297842878 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1026102Z [rank3]:[W1204 13:18:30.323723909 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1026433Z [rank0]:[W1204 13:18:30.454648269 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1026474Z FAILED [22.9370s] [ 5%] 2025-12-04T13:38:32.1026476Z 2025-12-04T13:38:32.1026536Z =================================== FAILURES =================================== 2025-12-04T13:38:32.1026638Z __ TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda ___ 2025-12-04T13:38:32.1026690Z Traceback (most recent call last): 2025-12-04T13:38:32.1026856Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.1026903Z self._join_processes(fn) 2025-12-04T13:38:32.1027088Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.1027146Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.1027326Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.1027374Z raise RuntimeError(error) 2025-12-04T13:38:32.1027455Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.1027504Z Traceback (most recent call last): 2025-12-04T13:38:32.1027668Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1027715Z getattr(self, test_name)() 2025-12-04T13:38:32.1027875Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1027913Z fn() 2025-12-04T13:38:32.1028077Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1028122Z method(*args, **kwargs) 2025-12-04T13:38:32.1028277Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1028318Z method(*args, **kwargs) 2025-12-04T13:38:32.1028474Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1028512Z with policy(): 2025-12-04T13:38:32.1028681Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1028722Z raise RuntimeError(msg) 2025-12-04T13:38:32.1029081Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 1. CUDA driver allocated memory was 2317352960 and is now 17492344832. 
2025-12-04T13:38:32.1029083Z 2025-12-04T13:38:32.1029162Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1029399Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:38:32.1029401Z 2025-12-04T13:38:32.1029490Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1029493Z 2025-12-04T13:38:32.1029498Z 2025-12-04T13:38:32.1029623Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.1029714Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.1029954Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-8a4f83dd6075c6e2.xml - 2025-12-04T13:38:32.1030020Z =========================== short test summary info ============================ 2025-12-04T13:38:32.1030271Z FAILED [22.9370s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.1030321Z Traceback (most recent call last): 2025-12-04T13:38:32.1030489Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1030537Z getattr(self, test_name)() 2025-12-04T13:38:32.1030700Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1030738Z fn() 2025-12-04T13:38:32.1030892Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1030936Z method(*args, **kwargs) 2025-12-04T13:38:32.1031104Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1031149Z method(*args, **kwargs) 2025-12-04T13:38:32.1031300Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1031341Z with policy(): 2025-12-04T13:38:32.1031494Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1031540Z raise RuntimeError(msg) 2025-12-04T13:38:32.1031901Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 1. CUDA driver allocated memory was 2317352960 and is now 17492344832. 2025-12-04T13:38:32.1031907Z 2025-12-04T13:38:32.1031995Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1032234Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:38:32.1032237Z 2025-12-04T13:38:32.1032324Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1032391Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
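Note: both failing sessions repeat the same pair of FSDP UserWarnings: the module is wrapped while still on CPU, and `device_id` is passed as a bare "cuda" device with no index. The warnings' own suggested fix is to bind each rank to an explicit device, sketched below; `model` and `rank` are placeholders rather than names from the test, and an initialized process group is assumed.

    # Sketch of the fix the repeated FSDP UserWarnings recommend: give each
    # rank an explicit, indexed device before wrapping.
    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_model(model: torch.nn.Module, rank: int) -> FSDP:
        torch.cuda.set_device(rank)  # make the current device explicit
        # An indexed device_id also lets FSDP move the CPU module onto the GPU
        # for sharding initialization instead of defaulting to "current device".
        return FSDP(model, device_id=torch.device("cuda", rank))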
2025-12-04T13:38:32.1032454Z ====================== 1 failed, 13 deselected in 23.10s ======================= 2025-12-04T13:38:32.1032510Z Got exit code 1 2025-12-04T13:38:32.1032551Z Retrying single test... 2025-12-04T13:38:32.1032744Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-60c6062c7f204303.xml 2025-12-04T13:38:32.1032804Z ============================= test session starts ============================== 2025-12-04T13:38:32.1032924Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.1032965Z cachedir: .pytest_cache 2025-12-04T13:38:32.1033129Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.1033175Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.1033221Z configfile: pytest.ini 2025-12-04T13:38:32.1033385Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.1033464Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.1033709Z stepcurrent: skipping 13 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:38:32.1033752Z Running 1 items in this shard 2025-12-04T13:38:32.1033754Z 2025-12-04T13:38:32.1034066Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda I1204 13:18:46.955000 394215 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 394284 2025-12-04T13:38:32.1034222Z I1204 13:18:46.956000 394215 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 394285 2025-12-04T13:38:32.1034377Z I1204 13:18:46.956000 394215 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 394286 2025-12-04T13:38:32.1034528Z I1204 13:18:46.957000 394215 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 394287 2025-12-04T13:38:32.1035124Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1035166Z _warn_cpu_init() 2025-12-04T13:38:32.1035737Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.1035781Z _warn_cpu_init() 2025-12-04T13:38:32.1036285Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1036353Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1036840Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1036921Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1037493Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1037530Z _warn_cpu_init() 2025-12-04T13:38:32.1038023Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1038093Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1038665Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1038706Z _warn_cpu_init() 2025-12-04T13:38:32.1038997Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1039086Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1039650Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:32.1039714Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1040007Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1040088Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1040376Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1040459Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1040764Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1040842Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1041333Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1041406Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1041699Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1041778Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1042269Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1042330Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1042619Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1042710Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1042998Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1043082Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1043578Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1043638Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1043928Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1044002Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1045299Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1045430Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1046690Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1046830Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1047062Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1047107Z return func(*args, **kwargs) 2025-12-04T13:38:32.1047345Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T13:38:32.1047388Z return func(*args, **kwargs) 2025-12-04T13:38:32.1048641Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1048765Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1050087Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1050209Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1050444Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1050487Z return func(*args, **kwargs) 2025-12-04T13:38:32.1050712Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1050768Z return func(*args, **kwargs) 2025-12-04T13:38:32.1050990Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.1051032Z return func(*args, **kwargs) 2025-12-04T13:38:32.1051253Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 
2025-12-04T13:38:32.1051294Z return func(*args, **kwargs) 2025-12-04T13:38:32.1051514Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.1051555Z return func(*args, **kwargs) 2025-12-04T13:38:32.1051775Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.1051831Z return func(*args, **kwargs) 2025-12-04T13:38:32.1052125Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.1052167Z return func(*args, **kwargs) 2025-12-04T13:38:32.1052313Z [rank1]:E1204 13:18:55.448000 394285 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1052479Z [rank1]:E1204 13:18:55.448000 394285 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1052775Z [rank1]:E1204 13:18:55.448000 394285 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1052932Z [rank1]:E1204 13:18:55.448000 394285 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1053220Z [rank1]:E1204 13:18:55.448000 394285 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1053355Z [rank1]:E1204 13:18:55.448000 394285 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1053634Z [rank1]:E1204 13:18:55.448000 394285 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1053783Z [rank1]:E1204 13:18:55.448000 394285 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1054061Z [rank1]:E1204 13:18:55.448000 394285 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1054230Z [rank1]:E1204 13:18:55.448000 394285 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1054509Z [rank1]:E1204 13:18:55.448000 394285 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1054648Z [rank1]:E1204 13:18:55.448000 394285 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1054925Z [rank1]:E1204 13:18:55.448000 394285 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1055089Z [rank1]:E1204 
13:18:55.448000 394285 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1055574Z [rank1]:E1204 13:18:55.448000 394285 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 1. CUDA driver allocated memory was 2317352960 and is now 17492344832. 2025-12-04T13:38:32.1055692Z [rank1]:E1204 13:18:55.448000 394285 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1055889Z [rank1]:E1204 13:18:55.448000 394285 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1056253Z [rank1]:E1204 13:18:55.448000 394285 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:38:32.1056380Z [rank1]:E1204 13:18:55.448000 394285 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1056594Z [rank1]:E1204 13:18:55.448000 394285 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1056760Z [rank1]:E1204 13:18:55.448000 394285 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.1056799Z dist init r=1, world=4 2025-12-04T13:38:32.1056938Z [rank0]:E1204 13:18:55.449000 394284 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1057098Z [rank0]:E1204 13:18:55.449000 394284 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1057391Z [rank0]:E1204 13:18:55.449000 394284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1057548Z [rank0]:E1204 13:18:55.449000 394284 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1057844Z [rank0]:E1204 13:18:55.449000 394284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1057971Z [rank0]:E1204 13:18:55.449000 394284 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1058247Z [rank0]:E1204 13:18:55.449000 394284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1058400Z [rank0]:E1204 13:18:55.449000 394284 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1058687Z [rank0]:E1204 13:18:55.449000 394284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1058837Z 
[rank0]:E1204 13:18:55.449000 394284 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1059114Z [rank0]:E1204 13:18:55.449000 394284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1059250Z [rank0]:E1204 13:18:55.449000 394284 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1059541Z [rank0]:E1204 13:18:55.449000 394284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1059729Z [rank0]:E1204 13:18:55.449000 394284 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1060213Z [rank0]:E1204 13:18:55.449000 394284 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 0. CUDA driver allocated memory was 2453667840 and is now 17628659712. 2025-12-04T13:38:32.1060329Z [rank0]:E1204 13:18:55.449000 394284 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1060542Z [rank0]:E1204 13:18:55.449000 394284 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1060911Z [rank0]:E1204 13:18:55.449000 394284 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:38:32.1061024Z [rank0]:E1204 13:18:55.449000 394284 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1061236Z [rank0]:E1204 13:18:55.449000 394284 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1061400Z [rank0]:E1204 13:18:55.449000 394284 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.1061443Z dist init r=0, world=4 2025-12-04T13:38:32.1061579Z [rank2]:E1204 13:18:55.476000 394286 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1061742Z [rank2]:E1204 13:18:55.476000 394286 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1062041Z [rank2]:E1204 13:18:55.476000 394286 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1062198Z [rank2]:E1204 13:18:55.476000 394286 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1062483Z [rank2]:E1204 13:18:55.476000 394286 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 
2025-12-04T13:38:32.1062607Z [rank2]:E1204 13:18:55.476000 394286 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1062899Z [rank2]:E1204 13:18:55.476000 394286 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1063047Z [rank2]:E1204 13:18:55.476000 394286 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1063325Z [rank2]:E1204 13:18:55.476000 394286 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1063471Z [rank2]:E1204 13:18:55.476000 394286 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1063764Z [rank2]:E1204 13:18:55.476000 394286 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1063904Z [rank2]:E1204 13:18:55.476000 394286 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1064182Z [rank2]:E1204 13:18:55.476000 394286 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1064332Z [rank2]:E1204 13:18:55.476000 394286 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1064811Z [rank2]:E1204 13:18:55.476000 394286 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 2. CUDA driver allocated memory was 2300575744 and is now 17475567616. 
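The per-rank RuntimeErrors above come from the memory-leak check that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 enables: the harness records per-device allocator statistics before the test body runs and flags the test when those numbers do not return to their starting point. Below is a minimal sketch of that before/after comparison using only the public torch.cuda statistics APIs; it is not the harness's actual CudaMemoryLeakCheck, it omits the driver-level query that produces the second pair of numbers in the message, and `run_test_body` is a placeholder callable.

```python
import gc
import torch

def check_for_leak(run_test_body, device: int = 0) -> None:
    """Rough stand-in for the harness's leak check: compare caching-allocator
    usage on one device before and after the test body."""
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    before = torch.cuda.memory_allocated(device)

    run_test_body()

    gc.collect()                      # drop dead Python references first
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    after = torch.cuda.memory_allocated(device)

    if after > before:
        raise RuntimeError(
            f"possible leak on device {device}: allocator went from "
            f"{before} to {after} bytes"
        )
```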
2025-12-04T13:38:32.1064940Z [rank2]:E1204 13:18:55.476000 394286 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1065139Z [rank2]:E1204 13:18:55.476000 394286 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1065497Z [rank2]:E1204 13:18:55.476000 394286 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:38:32.1065612Z [rank2]:E1204 13:18:55.476000 394286 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1065823Z [rank2]:E1204 13:18:55.476000 394286 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1065990Z [rank2]:E1204 13:18:55.476000 394286 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.1066029Z dist init r=2, world=4 2025-12-04T13:38:32.1066179Z [rank3]:E1204 13:18:55.502000 394287 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1066337Z [rank3]:E1204 13:18:55.502000 394287 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1066628Z [rank3]:E1204 13:18:55.502000 394287 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1066787Z [rank3]:E1204 13:18:55.502000 394287 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1067087Z [rank3]:E1204 13:18:55.502000 394287 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1067213Z [rank3]:E1204 13:18:55.502000 394287 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1067489Z [rank3]:E1204 13:18:55.502000 394287 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1067638Z [rank3]:E1204 13:18:55.502000 394287 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1067913Z [rank3]:E1204 13:18:55.502000 394287 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1068076Z [rank3]:E1204 13:18:55.502000 394287 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1068356Z [rank3]:E1204 13:18:55.502000 394287 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1068494Z [rank3]:E1204 13:18:55.502000 394287 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.1068775Z [rank3]:E1204 13:18:55.502000 394287 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1068923Z [rank3]:E1204 13:18:55.502000 394287 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1069418Z [rank3]:E1204 13:18:55.502000 394287 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 3. CUDA driver allocated memory was 2250244096 and is now 17425235968. 2025-12-04T13:38:32.1069531Z [rank3]:E1204 13:18:55.502000 394287 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1069762Z [rank3]:E1204 13:18:55.502000 394287 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1070124Z [rank3]:E1204 13:18:55.502000 394287 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:38:32.1070238Z [rank3]:E1204 13:18:55.502000 394287 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1070453Z [rank3]:E1204 13:18:55.502000 394287 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1070638Z [rank3]:E1204 13:18:55.502000 394287 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.1070678Z dist init r=3, world=4 2025-12-04T13:38:32.1071014Z [rank1]:[W1204 13:18:55.613843353 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1071346Z [rank0]:[W1204 13:18:55.635737655 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1071690Z [rank2]:[W1204 13:18:55.692616914 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1072017Z [rank3]:[W1204 13:18:55.748095530 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1072058Z FAILED [22.8387s] [100%] 2025-12-04T13:38:32.1072060Z 2025-12-04T13:38:32.1072131Z =================================== FAILURES =================================== 2025-12-04T13:38:32.1072233Z __ TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda ___ 2025-12-04T13:38:32.1072279Z Traceback (most recent call last): 2025-12-04T13:38:32.1072449Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.1072492Z self._join_processes(fn) 2025-12-04T13:38:32.1072672Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.1072725Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.1072905Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.1072947Z raise RuntimeError(error) 2025-12-04T13:38:32.1073031Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.1073076Z Traceback (most recent call last): 2025-12-04T13:38:32.1073252Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1073294Z getattr(self, test_name)() 2025-12-04T13:38:32.1073455Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1073491Z fn() 2025-12-04T13:38:32.1073643Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1073686Z method(*args, **kwargs) 2025-12-04T13:38:32.1073839Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1073880Z method(*args, **kwargs) 2025-12-04T13:38:32.1074032Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1074072Z with policy(): 2025-12-04T13:38:32.1074224Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1074268Z raise RuntimeError(msg) 2025-12-04T13:38:32.1074634Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 0. CUDA driver allocated memory was 2453667840 and is now 17628659712. 
2025-12-04T13:38:32.1074636Z 2025-12-04T13:38:32.1074714Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1074947Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:38:32.1074949Z 2025-12-04T13:38:32.1075039Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1075042Z 2025-12-04T13:38:32.1075103Z Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.1075147Z Traceback (most recent call last): 2025-12-04T13:38:32.1075322Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1075363Z getattr(self, test_name)() 2025-12-04T13:38:32.1075526Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1075560Z fn() 2025-12-04T13:38:32.1075714Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1075753Z method(*args, **kwargs) 2025-12-04T13:38:32.1075906Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1075959Z method(*args, **kwargs) 2025-12-04T13:38:32.1076112Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1076149Z with policy(): 2025-12-04T13:38:32.1076305Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1076345Z raise RuntimeError(msg) 2025-12-04T13:38:32.1076698Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 1. CUDA driver allocated memory was 2317352960 and is now 17492344832. 2025-12-04T13:38:32.1076700Z 2025-12-04T13:38:32.1076774Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1077008Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:38:32.1077025Z 2025-12-04T13:38:32.1077114Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1077116Z 2025-12-04T13:38:32.1077118Z 2025-12-04T13:38:32.1077193Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.1077284Z Process 0 terminated with exit code 10, terminating remaining processes. 
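The traceback above shows the parent test process (`_join_processes` / `_check_return_codes`) turning a child rank's exit code 10 into the RuntimeError that fails the test. The sketch below shows that general parent/child pattern with the standard library's multiprocessing module; it is a simplified stand-in, not torch's MultiProcessTestCase machinery, and the exit code 10 is simply mirrored from the log.

```python
import multiprocessing as mp
import sys

def worker(rank: int) -> None:
    # A failing per-rank check exits with a non-zero code (10 in the log above).
    sys.exit(10 if rank == 0 else 0)

def run(world_size: int = 4) -> None:
    ctx = mp.get_context("spawn")
    procs = [ctx.Process(target=worker, args=(r,)) for r in range(world_size)]
    for p in procs:
        p.start()
    for rank, p in enumerate(procs):
        p.join()
        if p.exitcode != 0:
            raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")

if __name__ == "__main__":
    run()
```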
2025-12-04T13:38:32.1077516Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-60c6062c7f204303.xml - 2025-12-04T13:38:32.1077578Z =========================== short test summary info ============================ 2025-12-04T13:38:32.1077828Z FAILED [22.8387s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.1077876Z Traceback (most recent call last): 2025-12-04T13:38:32.1078040Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1078083Z getattr(self, test_name)() 2025-12-04T13:38:32.1078245Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1078281Z fn() 2025-12-04T13:38:32.1078443Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1078486Z method(*args, **kwargs) 2025-12-04T13:38:32.1078638Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1078679Z method(*args, **kwargs) 2025-12-04T13:38:32.1078832Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1078870Z with policy(): 2025-12-04T13:38:32.1079024Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1079064Z raise RuntimeError(msg) 2025-12-04T13:38:32.1079431Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 0. CUDA driver allocated memory was 2453667840 and is now 17628659712. 
2025-12-04T13:38:32.1079433Z 2025-12-04T13:38:32.1079506Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1079772Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:38:32.1079774Z 2025-12-04T13:38:32.1079876Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1079879Z 2025-12-04T13:38:32.1079940Z Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.1079983Z Traceback (most recent call last): 2025-12-04T13:38:32.1080148Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1080189Z getattr(self, test_name)() 2025-12-04T13:38:32.1080351Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1080387Z fn() 2025-12-04T13:38:32.1080536Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1080577Z method(*args, **kwargs) 2025-12-04T13:38:32.1080727Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1080769Z method(*args, **kwargs) 2025-12-04T13:38:32.1080934Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1080972Z with policy(): 2025-12-04T13:38:32.1081126Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1081168Z raise RuntimeError(msg) 2025-12-04T13:38:32.1081521Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 1. CUDA driver allocated memory was 2317352960 and is now 17492344832. 2025-12-04T13:38:32.1081523Z 2025-12-04T13:38:32.1081597Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1081830Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:38:32.1081834Z 2025-12-04T13:38:32.1081922Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1081986Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.1082052Z ====================== 1 failed, 32 deselected in 23.00s ======================= 2025-12-04T13:38:32.1082103Z Got exit code 1 2025-12-04T13:38:32.1082144Z Retrying single test... 
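Two of the warnings emitted during the run above point at their own fixes: barrier() suggests passing `device_id` to `init_process_group`, and ProcessGroupNCCL warns that `destroy_process_group()` was never called before exit. A minimal sketch combining both follows; it assumes a recent torch build where `init_process_group` accepts `device_id` and the usual torchrun-style environment variables (LOCAL_RANK, MASTER_ADDR, etc.), and it is not the test suite's own setup code.

```python
import os
import torch
import torch.distributed as dist

def main() -> None:
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    torch.cuda.set_device(local_rank)
    # Passing an indexed device here silences the barrier() device warning.
    dist.init_process_group(
        backend="nccl",
        device_id=torch.device("cuda", local_rank),
    )
    try:
        dist.barrier()
    finally:
        # Explicit teardown, as the ProcessGroupNCCL warning recommends.
        dist.destroy_process_group()

if __name__ == "__main__":
    main()
```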
2025-12-04T13:38:32.1082335Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-81f9b84a4e1fc9ed.xml 2025-12-04T13:38:32.1082392Z ============================= test session starts ============================== 2025-12-04T13:38:32.1082507Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.1082549Z cachedir: .pytest_cache 2025-12-04T13:38:32.1082710Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.1082756Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.1082797Z configfile: pytest.ini 2025-12-04T13:38:32.1082974Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.1083051Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.1083279Z stepcurrent: skipping 13 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:38:32.1083324Z Running 1 items in this shard 2025-12-04T13:38:32.1083326Z 2025-12-04T13:38:32.1083635Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda I1204 13:19:12.268000 395481 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 395550 2025-12-04T13:38:32.1083804Z I1204 13:19:12.268000 395481 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 395551 2025-12-04T13:38:32.1083957Z I1204 13:19:12.269000 395481 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 395552 2025-12-04T13:38:32.1084111Z I1204 13:19:12.269000 395481 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 395553 2025-12-04T13:38:32.1084695Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1084749Z _warn_cpu_init() 2025-12-04T13:38:32.1085329Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1085366Z _warn_cpu_init() 2025-12-04T13:38:32.1085864Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1085930Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1086431Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1086493Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1087065Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1087106Z _warn_cpu_init() 2025-12-04T13:38:32.1087609Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1087668Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1088250Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1088299Z _warn_cpu_init() 2025-12-04T13:38:32.1088596Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1088681Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1089174Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1089236Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1089535Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:38:32.1089649Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1089938Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1090017Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1090307Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1090385Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1090881Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1090963Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1091252Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1091331Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1091825Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1091903Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1092193Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1092269Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1092554Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1092651Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1093143Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:32.1093203Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1093491Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1093566Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1094843Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1094990Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1096259Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1096385Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1096629Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1096673Z return func(*args, **kwargs) 2025-12-04T13:38:32.1096900Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1096942Z return func(*args, **kwargs) 2025-12-04T13:38:32.1098205Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. 
This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1098339Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1098565Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1098620Z return func(*args, **kwargs) 2025-12-04T13:38:32.1099893Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1100017Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1100245Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1100309Z return func(*args, **kwargs) 2025-12-04T13:38:32.1100533Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1100573Z return func(*args, **kwargs) 2025-12-04T13:38:32.1100796Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1100837Z return func(*args, **kwargs) 2025-12-04T13:38:32.1101058Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T13:38:32.1101100Z return func(*args, **kwargs) 2025-12-04T13:38:32.1101333Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1101374Z return func(*args, **kwargs) 2025-12-04T13:38:32.1101663Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.1101705Z return func(*args, **kwargs) 2025-12-04T13:38:32.1101850Z [rank2]:E1204 13:19:20.721000 395552 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1102040Z [rank2]:E1204 13:19:20.721000 395552 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1102336Z [rank2]:E1204 13:19:20.721000 395552 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1102495Z [rank2]:E1204 13:19:20.721000 395552 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1102779Z [rank2]:E1204 13:19:20.721000 395552 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1102906Z [rank2]:E1204 13:19:20.721000 395552 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1103200Z [rank2]:E1204 13:19:20.721000 395552 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1103350Z [rank2]:E1204 13:19:20.721000 395552 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1103630Z [rank2]:E1204 13:19:20.721000 395552 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1103777Z [rank2]:E1204 13:19:20.721000 395552 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1104054Z [rank2]:E1204 13:19:20.721000 395552 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1104193Z [rank2]:E1204 13:19:20.721000 395552 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1104475Z [rank2]:E1204 13:19:20.721000 395552 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1104639Z [rank2]:E1204 13:19:20.721000 395552 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1105120Z [rank2]:E1204 13:19:20.721000 395552 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver 
API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 2. CUDA driver allocated memory was 2300575744 and is now 17475567616. 2025-12-04T13:38:32.1105239Z [rank2]:E1204 13:19:20.721000 395552 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1105434Z [rank2]:E1204 13:19:20.721000 395552 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1105808Z [rank2]:E1204 13:19:20.721000 395552 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:38:32.1105923Z [rank2]:E1204 13:19:20.721000 395552 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1106135Z [rank2]:E1204 13:19:20.721000 395552 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1106313Z [rank2]:E1204 13:19:20.721000 395552 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.1106353Z dist init r=2, world=4 2025-12-04T13:38:32.1106491Z [rank0]:E1204 13:19:20.729000 395550 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1106651Z [rank0]:E1204 13:19:20.729000 395550 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1106942Z [rank0]:E1204 13:19:20.729000 395550 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1107095Z [rank0]:E1204 13:19:20.729000 395550 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1107380Z [rank0]:E1204 13:19:20.729000 395550 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1107515Z [rank0]:E1204 13:19:20.729000 395550 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1107796Z [rank0]:E1204 13:19:20.729000 395550 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1107946Z [rank0]:E1204 13:19:20.729000 395550 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1108221Z [rank0]:E1204 13:19:20.729000 395550 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1108370Z [rank0]:E1204 13:19:20.729000 395550 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1108648Z [rank0]:E1204 13:19:20.729000 395550 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1108796Z [rank0]:E1204 13:19:20.729000 395550 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1109074Z [rank0]:E1204 13:19:20.729000 395550 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1109224Z [rank0]:E1204 13:19:20.729000 395550 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1109742Z [rank0]:E1204 13:19:20.729000 395550 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 0. CUDA driver allocated memory was 2453667840 and is now 17628659712. 2025-12-04T13:38:32.1109871Z [rank0]:E1204 13:19:20.729000 395550 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1110068Z [rank0]:E1204 13:19:20.729000 395550 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1110426Z [rank0]:E1204 13:19:20.729000 395550 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:38:32.1110555Z [rank0]:E1204 13:19:20.729000 395550 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1110768Z [rank0]:E1204 13:19:20.729000 395550 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1110932Z [rank0]:E1204 13:19:20.729000 395550 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.1110974Z dist init r=0, world=4 2025-12-04T13:38:32.1111111Z [rank3]:E1204 13:19:20.735000 395553 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1111271Z [rank3]:E1204 13:19:20.735000 395553 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1111565Z [rank3]:E1204 13:19:20.735000 395553 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1111736Z [rank3]:E1204 13:19:20.735000 395553 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1116184Z [rank3]:E1204 13:19:20.735000 395553 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1116320Z [rank3]:E1204 13:19:20.735000 395553 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1116600Z [rank3]:E1204 13:19:20.735000 395553 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1116753Z [rank3]:E1204 13:19:20.735000 395553 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1117035Z [rank3]:E1204 13:19:20.735000 395553 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1117184Z [rank3]:E1204 13:19:20.735000 395553 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1117490Z [rank3]:E1204 13:19:20.735000 395553 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1117629Z [rank3]:E1204 13:19:20.735000 395553 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1117911Z [rank3]:E1204 13:19:20.735000 395553 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1118063Z [rank3]:E1204 13:19:20.735000 395553 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1118555Z [rank3]:E1204 13:19:20.735000 395553 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 3. CUDA driver allocated memory was 2250244096 and is now 17425235968. 
2025-12-04T13:38:32.1118673Z [rank3]:E1204 13:19:20.735000 395553 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1118867Z [rank3]:E1204 13:19:20.735000 395553 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1119241Z [rank3]:E1204 13:19:20.735000 395553 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:38:32.1119356Z [rank3]:E1204 13:19:20.735000 395553 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1119600Z [rank3]:E1204 13:19:20.735000 395553 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1119764Z [rank3]:E1204 13:19:20.735000 395553 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.1119806Z dist init r=3, world=4 2025-12-04T13:38:32.1119945Z [rank1]:E1204 13:19:20.786000 395551 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1120107Z [rank1]:E1204 13:19:20.786000 395551 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1120415Z [rank1]:E1204 13:19:20.786000 395551 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1120570Z [rank1]:E1204 13:19:20.786000 395551 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1120861Z [rank1]:E1204 13:19:20.786000 395551 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1120985Z [rank1]:E1204 13:19:20.786000 395551 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1121265Z [rank1]:E1204 13:19:20.786000 395551 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1121414Z [rank1]:E1204 13:19:20.786000 395551 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1121706Z [rank1]:E1204 13:19:20.786000 395551 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1121856Z [rank1]:E1204 13:19:20.786000 395551 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1122131Z [rank1]:E1204 13:19:20.786000 395551 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1122272Z [rank1]:E1204 13:19:20.786000 395551 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.1122573Z [rank1]:E1204 13:19:20.786000 395551 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1122725Z [rank1]:E1204 13:19:20.786000 395551 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1123203Z [rank1]:E1204 13:19:20.786000 395551 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 1. CUDA driver allocated memory was 2317352960 and is now 17492344832. 2025-12-04T13:38:32.1123338Z [rank1]:E1204 13:19:20.786000 395551 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1123537Z [rank1]:E1204 13:19:20.786000 395551 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1123895Z [rank1]:E1204 13:19:20.786000 395551 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:38:32.1124009Z [rank1]:E1204 13:19:20.786000 395551 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1124220Z [rank1]:E1204 13:19:20.786000 395551 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1124385Z [rank1]:E1204 13:19:20.786000 395551 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.1124435Z dist init r=1, world=4 2025-12-04T13:38:32.1124772Z [rank2]:[W1204 13:19:20.889798945 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1125105Z [rank0]:[W1204 13:19:20.912058172 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1125433Z [rank3]:[W1204 13:19:20.932379265 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1125762Z [rank1]:[W1204 13:19:21.060325642 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1125804Z FAILED [22.7375s] [100%] 2025-12-04T13:38:32.1125806Z 2025-12-04T13:38:32.1125877Z =================================== FAILURES =================================== 2025-12-04T13:38:32.1125977Z __ TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda ___ 2025-12-04T13:38:32.1126025Z Traceback (most recent call last): 2025-12-04T13:38:32.1126190Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.1126236Z self._join_processes(fn) 2025-12-04T13:38:32.1126410Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.1126468Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.1126651Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.1126694Z raise RuntimeError(error) 2025-12-04T13:38:32.1126787Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.1126834Z Traceback (most recent call last): 2025-12-04T13:38:32.1126999Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1127041Z getattr(self, test_name)() 2025-12-04T13:38:32.1127205Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1127240Z fn() 2025-12-04T13:38:32.1127396Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1127449Z method(*args, **kwargs) 2025-12-04T13:38:32.1127605Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1127646Z method(*args, **kwargs) 2025-12-04T13:38:32.1127800Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1127838Z with policy(): 2025-12-04T13:38:32.1127994Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1128035Z raise RuntimeError(msg) 2025-12-04T13:38:32.1128392Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 0. CUDA driver allocated memory was 2453667840 and is now 17628659712. 
2025-12-04T13:38:32.1128406Z 2025-12-04T13:38:32.1128482Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1128718Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:38:32.1128721Z 2025-12-04T13:38:32.1128812Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1128814Z 2025-12-04T13:38:32.1128816Z 2025-12-04T13:38:32.1128893Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.1128985Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.1129221Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-81f9b84a4e1fc9ed.xml - 2025-12-04T13:38:32.1129286Z =========================== short test summary info ============================ 2025-12-04T13:38:32.1129537Z FAILED [22.7375s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.1129623Z Traceback (most recent call last): 2025-12-04T13:38:32.1129807Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1129852Z getattr(self, test_name)() 2025-12-04T13:38:32.1130015Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1130055Z fn() 2025-12-04T13:38:32.1130210Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1130250Z method(*args, **kwargs) 2025-12-04T13:38:32.1130406Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1130446Z method(*args, **kwargs) 2025-12-04T13:38:32.1130600Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1130649Z with policy(): 2025-12-04T13:38:32.1130807Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1130847Z raise RuntimeError(msg) 2025-12-04T13:38:32.1131204Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 0. CUDA driver allocated memory was 2453667840 and is now 17628659712. 2025-12-04T13:38:32.1131206Z 2025-12-04T13:38:32.1131281Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1131532Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:38:32.1131534Z 2025-12-04T13:38:32.1131620Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1131689Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:38:32.1131753Z ====================== 1 failed, 32 deselected in 22.90s ======================= 2025-12-04T13:38:32.1131793Z Got exit code 1 2025-12-04T13:38:32.1131977Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:38:32.1132105Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:32.1132296Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-82617ca858ac6daf.xml 2025-12-04T13:38:32.1132368Z ============================= test session starts ============================== 2025-12-04T13:38:32.1132485Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.1132527Z cachedir: .pytest_cache 2025-12-04T13:38:32.1132688Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.1132735Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.1132778Z configfile: pytest.ini 2025-12-04T13:38:32.1132944Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.1133020Z collecting ... collected 60 items / 14 deselected / 46 selected 2025-12-04T13:38:32.1133073Z stepcurrent: skipping 14 already run items. 2025-12-04T13:38:32.1133120Z Running 19 items in this shard 2025-12-04T13:38:32.1133121Z 2025-12-04T13:38:32.1133433Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda I1204 13:19:37.497000 396747 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 396816 2025-12-04T13:38:32.1133593Z I1204 13:19:37.498000 396747 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 396817 2025-12-04T13:38:32.1133758Z I1204 13:19:37.499000 396747 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 396818 2025-12-04T13:38:32.1133912Z I1204 13:19:37.499000 396747 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 396819 2025-12-04T13:38:32.1134499Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1134539Z _warn_cpu_init() 2025-12-04T13:38:32.1135044Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:32.1135107Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1135689Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1135742Z _warn_cpu_init() 2025-12-04T13:38:32.1136234Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1136298Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1136870Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1136922Z _warn_cpu_init() 2025-12-04T13:38:32.1137413Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1137471Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1138044Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1138084Z _warn_cpu_init() 2025-12-04T13:38:32.1138395Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1138484Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1138975Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:32.1139036Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1139339Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1139426Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1139941Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1140020Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1140307Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1140387Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1140678Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1140756Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1141043Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1141134Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1141422Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1141497Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1141991Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1142052Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1142344Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.1142392Z return func(*args, **kwargs) 2025-12-04T13:38:32.1142690Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:38:32.1142772Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1143262Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1143324Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1143626Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1143701Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1143932Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1143974Z return func(*args, **kwargs) 2025-12-04T13:38:32.1144200Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1144242Z return func(*args, **kwargs) 2025-12-04T13:38:32.1144480Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1144521Z return func(*args, **kwargs) 2025-12-04T13:38:32.1144745Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1144786Z return func(*args, **kwargs) 2025-12-04T13:38:32.1145008Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1145049Z return func(*args, **kwargs) 2025-12-04T13:38:32.1145272Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1145315Z return func(*args, **kwargs) 2025-12-04T13:38:32.1145546Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1145589Z return func(*args, **kwargs) 2025-12-04T13:38:32.1145811Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T13:38:32.1145856Z return func(*args, **kwargs) 2025-12-04T13:38:32.1146003Z [rank2]:E1204 13:19:46.095000 396818 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1146169Z [rank2]:E1204 13:19:46.095000 396818 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1146463Z [rank2]:E1204 13:19:46.095000 396818 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1146627Z [rank2]:E1204 13:19:46.095000 396818 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1146928Z [rank2]:E1204 13:19:46.095000 396818 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1147056Z [rank2]:E1204 13:19:46.095000 396818 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1147337Z [rank2]:E1204 13:19:46.095000 396818 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1147489Z [rank2]:E1204 13:19:46.095000 396818 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1147771Z [rank2]:E1204 13:19:46.095000 396818 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1147929Z [rank2]:E1204 13:19:46.095000 396818 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1148208Z [rank2]:E1204 13:19:46.095000 396818 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1148344Z [rank2]:E1204 13:19:46.095000 396818 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1148626Z [rank2]:E1204 13:19:46.095000 396818 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1148790Z [rank2]:E1204 13:19:46.095000 396818 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1149275Z [rank2]:E1204 13:19:46.095000 396818 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17467179008. 
2025-12-04T13:38:32.1149394Z [rank2]:E1204 13:19:46.095000 396818 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1149630Z [rank2]:E1204 13:19:46.095000 396818 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1150009Z [rank2]:E1204 13:19:46.095000 396818 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:38:32.1150124Z [rank2]:E1204 13:19:46.095000 396818 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1150342Z [rank2]:E1204 13:19:46.095000 396818 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1150508Z [rank2]:E1204 13:19:46.095000 396818 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.1150546Z dist init r=2, world=4 2025-12-04T13:38:32.1150686Z [rank0]:E1204 13:19:46.105000 396816 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1150846Z [rank0]:E1204 13:19:46.105000 396816 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1151137Z [rank0]:E1204 13:19:46.105000 396816 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1151306Z [rank0]:E1204 13:19:46.105000 396816 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1151595Z [rank0]:E1204 13:19:46.105000 396816 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1151719Z [rank0]:E1204 13:19:46.105000 396816 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1151999Z [rank0]:E1204 13:19:46.105000 396816 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1152162Z [rank0]:E1204 13:19:46.105000 396816 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1152439Z [rank0]:E1204 13:19:46.105000 396816 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1152587Z [rank0]:E1204 13:19:46.105000 396816 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1152862Z [rank0]:E1204 13:19:46.105000 396816 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1153015Z [rank0]:E1204 13:19:46.105000 396816 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.1153292Z [rank0]:E1204 13:19:46.105000 396816 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1153444Z [rank0]:E1204 13:19:46.105000 396816 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1153931Z [rank0]:E1204 13:19:46.105000 396816 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 2025-12-04T13:38:32.1154066Z [rank0]:E1204 13:19:46.105000 396816 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1154264Z [rank0]:E1204 13:19:46.105000 396816 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1154622Z [rank0]:E1204 13:19:46.105000 396816 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:38:32.1154738Z [rank0]:E1204 13:19:46.105000 396816 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1154949Z [rank0]:E1204 13:19:46.105000 396816 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1155118Z [rank0]:E1204 13:19:46.105000 396816 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.1155160Z dist init r=0, world=4 2025-12-04T13:38:32.1155296Z [rank1]:E1204 13:19:46.174000 396817 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1155469Z [rank1]:E1204 13:19:46.174000 396817 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1155757Z [rank1]:E1204 13:19:46.174000 396817 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1155913Z [rank1]:E1204 13:19:46.174000 396817 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1156199Z [rank1]:E1204 13:19:46.174000 396817 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1156327Z [rank1]:E1204 13:19:46.174000 396817 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1156615Z [rank1]:E1204 13:19:46.174000 396817 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1156766Z [rank1]:E1204 13:19:46.174000 396817 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:38:32.1157045Z [rank1]:E1204 13:19:46.174000 396817 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1157203Z [rank1]:E1204 13:19:46.174000 396817 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1157483Z [rank1]:E1204 13:19:46.174000 396817 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1157620Z [rank1]:E1204 13:19:46.174000 396817 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1157900Z [rank1]:E1204 13:19:46.174000 396817 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1158047Z [rank1]:E1204 13:19:46.174000 396817 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1158527Z [rank1]:E1204 13:19:46.174000 396817 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 17483956224. 2025-12-04T13:38:32.1158655Z [rank1]:E1204 13:19:46.174000 396817 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1158851Z [rank1]:E1204 13:19:46.174000 396817 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1159208Z [rank1]:E1204 13:19:46.174000 396817 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:38:32.1159322Z [rank1]:E1204 13:19:46.174000 396817 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1159538Z [rank1]:E1204 13:19:46.174000 396817 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1159742Z [rank1]:E1204 13:19:46.174000 396817 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.1159794Z dist init r=1, world=4 2025-12-04T13:38:32.1159934Z [rank3]:E1204 13:19:46.176000 396819 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1160092Z [rank3]:E1204 13:19:46.176000 396819 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1160381Z [rank3]:E1204 13:19:46.176000 396819 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1160535Z [rank3]:E1204 13:19:46.176000 396819 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:38:32.1160840Z [rank3]:E1204 13:19:46.176000 396819 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1160963Z [rank3]:E1204 13:19:46.176000 396819 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1161244Z [rank3]:E1204 13:19:46.176000 396819 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1161394Z [rank3]:E1204 13:19:46.176000 396819 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1161684Z [rank3]:E1204 13:19:46.176000 396819 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1161833Z [rank3]:E1204 13:19:46.176000 396819 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1162113Z [rank3]:E1204 13:19:46.176000 396819 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1162251Z [rank3]:E1204 13:19:46.176000 396819 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1162528Z [rank3]:E1204 13:19:46.176000 396819 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1162693Z [rank3]:E1204 13:19:46.176000 396819 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1163172Z [rank3]:E1204 13:19:46.176000 396819 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17416847360. 
2025-12-04T13:38:32.1163286Z [rank3]:E1204 13:19:46.176000 396819 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1163482Z [rank3]:E1204 13:19:46.176000 396819 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1163836Z [rank3]:E1204 13:19:46.176000 396819 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:38:32.1163954Z [rank3]:E1204 13:19:46.176000 396819 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1164174Z [rank3]:E1204 13:19:46.176000 396819 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1164340Z [rank3]:E1204 13:19:46.176000 396819 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.1164378Z dist init r=3, world=4 2025-12-04T13:38:32.1164715Z [rank0]:[W1204 13:19:46.302651884 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1165060Z [rank2]:[W1204 13:19:46.303122638 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1165389Z [rank1]:[W1204 13:19:46.418872470 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1165716Z [rank3]:[W1204 13:19:46.544867811 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1165767Z FAILED [22.8381s] [ 5%] 2025-12-04T13:38:32.1165770Z 2025-12-04T13:38:32.1165830Z =================================== FAILURES =================================== 2025-12-04T13:38:32.1165930Z ___ TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda ___ 2025-12-04T13:38:32.1165980Z Traceback (most recent call last): 2025-12-04T13:38:32.1166150Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.1166194Z self._join_processes(fn) 2025-12-04T13:38:32.1166371Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.1166425Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.1166605Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.1166649Z raise RuntimeError(error) 2025-12-04T13:38:32.1166743Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.1166788Z Traceback (most recent call last): 2025-12-04T13:38:32.1166953Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1166996Z getattr(self, test_name)() 2025-12-04T13:38:32.1167158Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1167192Z fn() 2025-12-04T13:38:32.1167347Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1167387Z method(*args, **kwargs) 2025-12-04T13:38:32.1167542Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1167583Z method(*args, **kwargs) 2025-12-04T13:38:32.1167739Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1167776Z with policy(): 2025-12-04T13:38:32.1167933Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1167975Z raise RuntimeError(msg) 2025-12-04T13:38:32.1168340Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 
2025-12-04T13:38:32.1168343Z 2025-12-04T13:38:32.1168422Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1168655Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:38:32.1168659Z 2025-12-04T13:38:32.1168748Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1168750Z 2025-12-04T13:38:32.1168752Z 2025-12-04T13:38:32.1168825Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.1168925Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.1169161Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-82617ca858ac6daf.xml - 2025-12-04T13:38:32.1169224Z =========================== short test summary info ============================ 2025-12-04T13:38:32.1169470Z FAILED [22.8381s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.1169535Z Traceback (most recent call last): 2025-12-04T13:38:32.1169741Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1169785Z getattr(self, test_name)() 2025-12-04T13:38:32.1169951Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1169986Z fn() 2025-12-04T13:38:32.1170142Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1170182Z method(*args, **kwargs) 2025-12-04T13:38:32.1170336Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1170375Z method(*args, **kwargs) 2025-12-04T13:38:32.1170527Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1170565Z with policy(): 2025-12-04T13:38:32.1170734Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1170774Z raise RuntimeError(msg) 2025-12-04T13:38:32.1171130Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 2025-12-04T13:38:32.1171133Z 2025-12-04T13:38:32.1171207Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1171441Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:38:32.1171444Z 2025-12-04T13:38:32.1171533Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1171599Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:38:32.1171664Z ====================== 1 failed, 14 deselected in 23.00s ======================= 2025-12-04T13:38:32.1171703Z Got exit code 1 2025-12-04T13:38:32.1171746Z Retrying single test... 2025-12-04T13:38:32.1171949Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-76a7782323b97443.xml 2025-12-04T13:38:32.1172011Z ============================= test session starts ============================== 2025-12-04T13:38:32.1172126Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.1172170Z cachedir: .pytest_cache 2025-12-04T13:38:32.1172329Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.1172380Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.1172422Z configfile: pytest.ini 2025-12-04T13:38:32.1172589Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.1172664Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.1172902Z stepcurrent: skipping 14 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:38:32.1172946Z Running 1 items in this shard 2025-12-04T13:38:32.1172947Z 2025-12-04T13:38:32.1173253Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda I1204 13:20:02.914000 398157 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 398226 2025-12-04T13:38:32.1173412Z I1204 13:20:02.915000 398157 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 398227 2025-12-04T13:38:32.1173580Z I1204 13:20:02.915000 398157 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 398228 2025-12-04T13:38:32.1173733Z I1204 13:20:02.916000 398157 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 398229 2025-12-04T13:38:32.1174314Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1174354Z _warn_cpu_init() 2025-12-04T13:38:32.1174850Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:32.1174927Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1175505Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1175541Z _warn_cpu_init() 2025-12-04T13:38:32.1176120Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1176158Z _warn_cpu_init() 2025-12-04T13:38:32.1176662Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1176725Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1177211Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1177284Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1177852Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1177891Z _warn_cpu_init() 2025-12-04T13:38:32.1178394Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1178452Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1178745Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:38:32.1178828Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1179317Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1179391Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1179719Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1179799Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1180085Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1180167Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1180456Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1180538Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1181043Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1181104Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1181394Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1181471Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1181771Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1181846Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1182134Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1182212Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1182702Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1182778Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1183069Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.1183114Z return func(*args, **kwargs) 2025-12-04T13:38:32.1183402Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1183478Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1183723Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1183767Z return func(*args, **kwargs) 2025-12-04T13:38:32.1183992Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1184034Z return func(*args, **kwargs) 2025-12-04T13:38:32.1184256Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1184298Z return func(*args, **kwargs) 2025-12-04T13:38:32.1184519Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1184563Z return func(*args, **kwargs) 2025-12-04T13:38:32.1184784Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.1184824Z return func(*args, **kwargs) 2025-12-04T13:38:32.1185057Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.1185097Z return func(*args, **kwargs) 2025-12-04T13:38:32.1185319Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.1185358Z return func(*args, **kwargs) 2025-12-04T13:38:32.1185578Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 
2025-12-04T13:38:32.1185619Z return func(*args, **kwargs) 2025-12-04T13:38:32.1185766Z [rank3]:E1204 13:20:11.623000 398229 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1185946Z [rank3]:E1204 13:20:11.623000 398229 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1186241Z [rank3]:E1204 13:20:11.623000 398229 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1186397Z [rank3]:E1204 13:20:11.623000 398229 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1186687Z [rank3]:E1204 13:20:11.623000 398229 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1186829Z [rank3]:E1204 13:20:11.623000 398229 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1187111Z [rank3]:E1204 13:20:11.623000 398229 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1187262Z [rank3]:E1204 13:20:11.623000 398229 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1187540Z [rank3]:E1204 13:20:11.623000 398229 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1187690Z [rank3]:E1204 13:20:11.623000 398229 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1187976Z [rank3]:E1204 13:20:11.623000 398229 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1188115Z [rank3]:E1204 13:20:11.623000 398229 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1188394Z [rank3]:E1204 13:20:11.623000 398229 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1188542Z [rank3]:E1204 13:20:11.623000 398229 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1189024Z [rank3]:E1204 13:20:11.623000 398229 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17416847360. 
2025-12-04T13:38:32.1189142Z [rank3]:E1204 13:20:11.623000 398229 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1189353Z [rank3]:E1204 13:20:11.623000 398229 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1189752Z [rank3]:E1204 13:20:11.623000 398229 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:38:32.1189866Z [rank3]:E1204 13:20:11.623000 398229 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1190081Z [rank3]:E1204 13:20:11.623000 398229 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1190266Z [rank3]:E1204 13:20:11.623000 398229 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.1190308Z dist init r=3, world=4 2025-12-04T13:38:32.1190446Z [rank1]:E1204 13:20:11.666000 398227 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1190608Z [rank1]:E1204 13:20:11.666000 398227 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1190894Z [rank1]:E1204 13:20:11.666000 398227 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1191068Z [rank1]:E1204 13:20:11.666000 398227 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1191355Z [rank1]:E1204 13:20:11.666000 398227 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1191483Z [rank1]:E1204 13:20:11.666000 398227 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1191765Z [rank1]:E1204 13:20:11.666000 398227 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1191915Z [rank1]:E1204 13:20:11.666000 398227 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1192197Z [rank1]:E1204 13:20:11.666000 398227 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1192365Z [rank1]:E1204 13:20:11.666000 398227 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1192643Z [rank1]:E1204 13:20:11.666000 398227 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1192778Z [rank1]:E1204 13:20:11.666000 398227 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.1193057Z [rank1]:E1204 13:20:11.666000 398227 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1193208Z [rank1]:E1204 13:20:11.666000 398227 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1193699Z [rank1]:E1204 13:20:11.666000 398227 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 17483956224. 2025-12-04T13:38:32.1193817Z [rank1]:E1204 13:20:11.666000 398227 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1194014Z [rank1]:E1204 13:20:11.666000 398227 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1194371Z [rank1]:E1204 13:20:11.666000 398227 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:38:32.1194486Z [rank1]:E1204 13:20:11.666000 398227 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1194711Z [rank1]:E1204 13:20:11.666000 398227 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1194876Z [rank1]:E1204 13:20:11.666000 398227 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.1194914Z dist init r=1, world=4 2025-12-04T13:38:32.1195052Z [rank0]:E1204 13:20:11.680000 398226 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1195211Z [rank0]:E1204 13:20:11.680000 398226 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1195512Z [rank0]:E1204 13:20:11.680000 398226 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1195666Z [rank0]:E1204 13:20:11.680000 398226 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1195954Z [rank0]:E1204 13:20:11.680000 398226 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1196080Z [rank0]:E1204 13:20:11.680000 398226 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1196358Z [rank0]:E1204 13:20:11.680000 398226 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1196518Z [rank0]:E1204 13:20:11.680000 398226 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:38:32.1196795Z [rank0]:E1204 13:20:11.680000 398226 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1196944Z [rank0]:E1204 13:20:11.680000 398226 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1197219Z [rank0]:E1204 13:20:11.680000 398226 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1197357Z [rank0]:E1204 13:20:11.680000 398226 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1197638Z [rank0]:E1204 13:20:11.680000 398226 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1197786Z [rank0]:E1204 13:20:11.680000 398226 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1198279Z [rank0]:E1204 13:20:11.680000 398226 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 2025-12-04T13:38:32.1198395Z [rank0]:E1204 13:20:11.680000 398226 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1198594Z [rank0]:E1204 13:20:11.680000 398226 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1198963Z [rank0]:E1204 13:20:11.680000 398226 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:38:32.1199079Z [rank0]:E1204 13:20:11.680000 398226 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1199291Z [rank0]:E1204 13:20:11.680000 398226 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1199456Z [rank0]:E1204 13:20:11.680000 398226 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.1199507Z dist init r=0, world=4 2025-12-04T13:38:32.1199673Z [rank2]:E1204 13:20:11.697000 398228 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1199836Z [rank2]:E1204 13:20:11.697000 398228 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1200127Z [rank2]:E1204 13:20:11.697000 398228 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1200283Z [rank2]:E1204 13:20:11.697000 398228 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:38:32.1200570Z [rank2]:E1204 13:20:11.697000 398228 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1200711Z [rank2]:E1204 13:20:11.697000 398228 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1200994Z [rank2]:E1204 13:20:11.697000 398228 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1201143Z [rank2]:E1204 13:20:11.697000 398228 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1201423Z [rank2]:E1204 13:20:11.697000 398228 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1201570Z [rank2]:E1204 13:20:11.697000 398228 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1201850Z [rank2]:E1204 13:20:11.697000 398228 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1201988Z [rank2]:E1204 13:20:11.697000 398228 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1202291Z [rank2]:E1204 13:20:11.697000 398228 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1202441Z [rank2]:E1204 13:20:11.697000 398228 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1202915Z [rank2]:E1204 13:20:11.697000 398228 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17467179008. 
2025-12-04T13:38:32.1203032Z [rank2]:E1204 13:20:11.697000 398228 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1203242Z [rank2]:E1204 13:20:11.697000 398228 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1203600Z [rank2]:E1204 13:20:11.697000 398228 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:38:32.1203712Z [rank2]:E1204 13:20:11.697000 398228 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1203940Z [rank2]:E1204 13:20:11.697000 398228 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1204105Z [rank2]:E1204 13:20:11.697000 398228 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.1204144Z dist init r=2, world=4 2025-12-04T13:38:32.1204479Z [rank3]:[W1204 13:20:11.817392678 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1204806Z [rank1]:[W1204 13:20:11.930174227 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1205135Z [rank0]:[W1204 13:20:12.037192689 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1205474Z [rank2]:[W1204 13:20:12.087850407 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1205516Z FAILED [23.0372s] [100%] 2025-12-04T13:38:32.1205518Z 2025-12-04T13:38:32.1205576Z =================================== FAILURES =================================== 2025-12-04T13:38:32.1205676Z ___ TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda ___ 2025-12-04T13:38:32.1205724Z Traceback (most recent call last): 2025-12-04T13:38:32.1205889Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.1205936Z self._join_processes(fn) 2025-12-04T13:38:32.1206111Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.1206166Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.1206356Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.1206401Z raise RuntimeError(error) 2025-12-04T13:38:32.1206481Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.1206527Z Traceback (most recent call last): 2025-12-04T13:38:32.1206688Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1206732Z getattr(self, test_name)() 2025-12-04T13:38:32.1206891Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1206929Z fn() 2025-12-04T13:38:32.1207080Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1207122Z method(*args, **kwargs) 2025-12-04T13:38:32.1207286Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1207328Z method(*args, **kwargs) 2025-12-04T13:38:32.1207479Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1207518Z with policy(): 2025-12-04T13:38:32.1207673Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1207717Z raise RuntimeError(msg) 2025-12-04T13:38:32.1208085Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17416847360. 
2025-12-04T13:38:32.1208089Z 2025-12-04T13:38:32.1208165Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1208400Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:38:32.1208402Z 2025-12-04T13:38:32.1208489Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1208491Z 2025-12-04T13:38:32.1208493Z 2025-12-04T13:38:32.1208569Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.1208657Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.1208903Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-76a7782323b97443.xml - 2025-12-04T13:38:32.1208965Z =========================== short test summary info ============================ 2025-12-04T13:38:32.1209215Z FAILED [23.0372s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.1209262Z Traceback (most recent call last): 2025-12-04T13:38:32.1209428Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1209472Z getattr(self, test_name)() 2025-12-04T13:38:32.1209657Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1209695Z fn() 2025-12-04T13:38:32.1209850Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1209892Z method(*args, **kwargs) 2025-12-04T13:38:32.1210045Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1210087Z method(*args, **kwargs) 2025-12-04T13:38:32.1210252Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1210291Z with policy(): 2025-12-04T13:38:32.1210443Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1210486Z raise RuntimeError(msg) 2025-12-04T13:38:32.1210837Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17416847360. 2025-12-04T13:38:32.1210844Z 2025-12-04T13:38:32.1210919Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1211164Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:38:32.1211167Z 2025-12-04T13:38:32.1211254Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1211319Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
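The warnings repeated throughout the run above all point at the same pattern: FSDP is handed a bare `device_id` of `cuda` with no index while the module is still on CPU, `barrier()` has to guess the device because `init_process_group` was not given one, and `destroy_process_group()` is never called before the workers exit. The lines below are a minimal sketch, not the test's actual code, of a multi-process FSDP setup that follows those recommendations; the stand-in model, world size, and rendezvous settings are assumptions for illustration only.

import os
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP


def run(rank: int, world_size: int) -> None:
    # Assumed single-node rendezvous; the real test suite uses its own launcher.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")

    # Bind this process to one GPU up front so collectives and FSDP init
    # do not fall back to "the current device" heuristics.
    torch.cuda.set_device(rank)
    dist.init_process_group(
        "nccl",
        rank=rank,
        world_size=world_size,
        device_id=torch.device("cuda", rank),  # silences the barrier() warning
    )

    # Stand-in model; the real test wraps a mixture-of-experts module.
    model = torch.nn.Linear(8, 8)

    # An explicit device index avoids the "does not have an explicit index"
    # warning and moves the CPU module to the GPU for sharding initialization.
    fsdp_model = FSDP(model, device_id=rank)

    out = fsdp_model(torch.randn(4, 8, device=f"cuda:{rank}"))
    out.sum().backward()

    # Tearing the group down explicitly avoids the ProcessGroupNCCL warning
    # about destroy_process_group() not being called before program exit.
    dist.destroy_process_group()


if __name__ == "__main__":
    world = torch.cuda.device_count()
    torch.multiprocessing.spawn(run, args=(world,), nprocs=world)

This only addresses the warnings; when the repro command above is run with PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1, the test must also leave caching-allocator and driver allocations balanced, which is the actual failure reported here.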
2025-12-04T13:38:32.1211382Z ====================== 1 failed, 32 deselected in 23.20s ======================= 2025-12-04T13:38:32.1211420Z Got exit code 1 2025-12-04T13:38:32.1211460Z Retrying single test... 2025-12-04T13:38:32.1211651Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-729675f2485b5e8a.xml 2025-12-04T13:38:32.1211724Z ============================= test session starts ============================== 2025-12-04T13:38:32.1211839Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.1211879Z cachedir: .pytest_cache 2025-12-04T13:38:32.1212040Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.1212086Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.1212128Z configfile: pytest.ini 2025-12-04T13:38:32.1212293Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.1212369Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.1212593Z stepcurrent: skipping 14 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:38:32.1212652Z Running 1 items in this shard 2025-12-04T13:38:32.1212655Z 2025-12-04T13:38:32.1212961Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda I1204 13:20:28.531000 399567 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 399636 2025-12-04T13:38:32.1213117Z I1204 13:20:28.531000 399567 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 399637 2025-12-04T13:38:32.1213271Z I1204 13:20:28.532000 399567 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 399638 2025-12-04T13:38:32.1213423Z I1204 13:20:28.532000 399567 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 399639 2025-12-04T13:38:32.1214174Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1214214Z _warn_cpu_init() 2025-12-04T13:38:32.1214721Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:32.1214788Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1215374Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1215414Z _warn_cpu_init() 2025-12-04T13:38:32.1215903Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1215966Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1216552Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1216590Z _warn_cpu_init() 2025-12-04T13:38:32.1217084Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1217142Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1217716Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1217773Z _warn_cpu_init() 2025-12-04T13:38:32.1218062Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1218147Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1218432Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:38:32.1218515Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1219017Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1219077Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1219364Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1219443Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1219985Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1220043Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1220333Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1220410Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1220699Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1220790Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1221278Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1221338Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1221625Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1221701Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1222005Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.1222051Z return func(*args, **kwargs) 2025-12-04T13:38:32.1222341Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1222423Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1222913Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1222972Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1223260Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1223347Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1223578Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1223620Z return func(*args, **kwargs) 2025-12-04T13:38:32.1223847Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1223889Z return func(*args, **kwargs) 2025-12-04T13:38:32.1224113Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1224166Z return func(*args, **kwargs) 2025-12-04T13:38:32.1224389Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1224431Z return func(*args, **kwargs) 2025-12-04T13:38:32.1224653Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.1224695Z return func(*args, **kwargs) 2025-12-04T13:38:32.1224914Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.1224969Z return func(*args, **kwargs) 2025-12-04T13:38:32.1225191Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.1225232Z return func(*args, **kwargs) 2025-12-04T13:38:32.1225453Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 
2025-12-04T13:38:32.1225494Z return func(*args, **kwargs) 2025-12-04T13:38:32.1225642Z [rank1]:E1204 13:20:37.229000 399637 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1225807Z [rank1]:E1204 13:20:37.229000 399637 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1226108Z [rank1]:E1204 13:20:37.229000 399637 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1226266Z [rank1]:E1204 13:20:37.229000 399637 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1226554Z [rank1]:E1204 13:20:37.229000 399637 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1226678Z [rank1]:E1204 13:20:37.229000 399637 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1226972Z [rank1]:E1204 13:20:37.229000 399637 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1227123Z [rank1]:E1204 13:20:37.229000 399637 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1227401Z [rank1]:E1204 13:20:37.229000 399637 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1227558Z [rank1]:E1204 13:20:37.229000 399637 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1227837Z [rank1]:E1204 13:20:37.229000 399637 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1227976Z [rank1]:E1204 13:20:37.229000 399637 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1228255Z [rank1]:E1204 13:20:37.229000 399637 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1228417Z [rank1]:E1204 13:20:37.229000 399637 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1228897Z [rank1]:E1204 13:20:37.229000 399637 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 17483956224. 
2025-12-04T13:38:32.1229014Z [rank1]:E1204 13:20:37.229000 399637 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1229222Z [rank1]:E1204 13:20:37.229000 399637 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1229741Z [rank1]:E1204 13:20:37.229000 399637 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:38:32.1229861Z [rank1]:E1204 13:20:37.229000 399637 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1230073Z [rank1]:E1204 13:20:37.229000 399637 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1230239Z [rank1]:E1204 13:20:37.229000 399637 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.1230279Z dist init r=1, world=4 2025-12-04T13:38:32.1230434Z [rank0]:E1204 13:20:37.289000 399636 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1230594Z [rank0]:E1204 13:20:37.289000 399636 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1230888Z [rank0]:E1204 13:20:37.289000 399636 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1231043Z [rank0]:E1204 13:20:37.289000 399636 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1231327Z [rank0]:E1204 13:20:37.289000 399636 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1231455Z [rank0]:E1204 13:20:37.289000 399636 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1231736Z [rank0]:E1204 13:20:37.289000 399636 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1231899Z [rank0]:E1204 13:20:37.289000 399636 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1232177Z [rank0]:E1204 13:20:37.289000 399636 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1232326Z [rank0]:E1204 13:20:37.289000 399636 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1232603Z [rank0]:E1204 13:20:37.289000 399636 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1232743Z [rank0]:E1204 13:20:37.289000 399636 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.1233036Z [rank0]:E1204 13:20:37.289000 399636 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1233184Z [rank0]:E1204 13:20:37.289000 399636 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1233661Z [rank0]:E1204 13:20:37.289000 399636 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 2025-12-04T13:38:32.1233795Z [rank0]:E1204 13:20:37.289000 399636 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1233996Z [rank0]:E1204 13:20:37.289000 399636 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1234355Z [rank0]:E1204 13:20:37.289000 399636 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:38:32.1234467Z [rank0]:E1204 13:20:37.289000 399636 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1234682Z [rank0]:E1204 13:20:37.289000 399636 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1234857Z [rank0]:E1204 13:20:37.289000 399636 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.1234897Z dist init r=0, world=4 2025-12-04T13:38:32.1235034Z [rank3]:E1204 13:20:37.302000 399639 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1235197Z [rank3]:E1204 13:20:37.302000 399639 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1235483Z [rank3]:E1204 13:20:37.302000 399639 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1235637Z [rank3]:E1204 13:20:37.302000 399639 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1235924Z [rank3]:E1204 13:20:37.302000 399639 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1236048Z [rank3]:E1204 13:20:37.302000 399639 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1236340Z [rank3]:E1204 13:20:37.302000 399639 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1236487Z [rank3]:E1204 13:20:37.302000 399639 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:38:32.1236767Z [rank3]:E1204 13:20:37.302000 399639 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1236916Z [rank3]:E1204 13:20:37.302000 399639 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1237204Z [rank3]:E1204 13:20:37.302000 399639 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1237344Z [rank3]:E1204 13:20:37.302000 399639 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1237620Z [rank3]:E1204 13:20:37.302000 399639 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1237769Z [rank3]:E1204 13:20:37.302000 399639 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1238260Z [rank3]:E1204 13:20:37.302000 399639 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17416847360. 2025-12-04T13:38:32.1238377Z [rank3]:E1204 13:20:37.302000 399639 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1238573Z [rank3]:E1204 13:20:37.302000 399639 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1238935Z [rank3]:E1204 13:20:37.302000 399639 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:38:32.1239061Z [rank3]:E1204 13:20:37.302000 399639 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1239271Z [rank3]:E1204 13:20:37.302000 399639 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1239437Z [rank3]:E1204 13:20:37.302000 399639 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.1239476Z dist init r=3, world=4 2025-12-04T13:38:32.1239658Z [rank2]:E1204 13:20:37.313000 399638 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1239817Z [rank2]:E1204 13:20:37.313000 399638 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1240104Z [rank2]:E1204 13:20:37.313000 399638 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1240259Z [rank2]:E1204 13:20:37.313000 399638 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:38:32.1240558Z [rank2]:E1204 13:20:37.313000 399638 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1240684Z [rank2]:E1204 13:20:37.313000 399638 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1240961Z [rank2]:E1204 13:20:37.313000 399638 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1241111Z [rank2]:E1204 13:20:37.313000 399638 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1241400Z [rank2]:E1204 13:20:37.313000 399638 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1241550Z [rank2]:E1204 13:20:37.313000 399638 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1241826Z [rank2]:E1204 13:20:37.313000 399638 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1241963Z [rank2]:E1204 13:20:37.313000 399638 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1242256Z [rank2]:E1204 13:20:37.313000 399638 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1242404Z [rank2]:E1204 13:20:37.313000 399638 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1242880Z [rank2]:E1204 13:20:37.313000 399638 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17467179008. 
2025-12-04T13:38:32.1242994Z [rank2]:E1204 13:20:37.313000 399638 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1243191Z [rank2]:E1204 13:20:37.313000 399638 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1243567Z [rank2]:E1204 13:20:37.313000 399638 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:38:32.1243684Z [rank2]:E1204 13:20:37.313000 399638 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1243897Z [rank2]:E1204 13:20:37.313000 399638 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1244059Z [rank2]:E1204 13:20:37.313000 399638 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.1244099Z dist init r=2, world=4 2025-12-04T13:38:32.1244435Z [rank1]:[W1204 13:20:37.400194084 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1244777Z [rank0]:[W1204 13:20:37.619990936 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1245105Z [rank2]:[W1204 13:20:37.666999629 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1245432Z [rank3]:[W1204 13:20:37.694596839 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1245476Z FAILED [22.9376s] [100%] 2025-12-04T13:38:32.1245478Z 2025-12-04T13:38:32.1245534Z =================================== FAILURES =================================== 2025-12-04T13:38:32.1245649Z ___ TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda ___ 2025-12-04T13:38:32.1245695Z Traceback (most recent call last): 2025-12-04T13:38:32.1245865Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.1245908Z self._join_processes(fn) 2025-12-04T13:38:32.1246084Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.1246138Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.1246319Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.1246375Z raise RuntimeError(error) 2025-12-04T13:38:32.1246457Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.1246501Z Traceback (most recent call last): 2025-12-04T13:38:32.1246665Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1246708Z getattr(self, test_name)() 2025-12-04T13:38:32.1246871Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1246905Z fn() 2025-12-04T13:38:32.1247059Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1247100Z method(*args, **kwargs) 2025-12-04T13:38:32.1247254Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1247308Z method(*args, **kwargs) 2025-12-04T13:38:32.1247460Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1247498Z with policy(): 2025-12-04T13:38:32.1247652Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1247696Z raise RuntimeError(msg) 2025-12-04T13:38:32.1248049Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 17483956224. 
2025-12-04T13:38:32.1248051Z 2025-12-04T13:38:32.1248128Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1248364Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:38:32.1248367Z 2025-12-04T13:38:32.1248455Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1248458Z 2025-12-04T13:38:32.1248459Z 2025-12-04T13:38:32.1248534Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.1248634Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.1248866Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-729675f2485b5e8a.xml - 2025-12-04T13:38:32.1248927Z =========================== short test summary info ============================ 2025-12-04T13:38:32.1249174Z FAILED [22.9376s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.1249220Z Traceback (most recent call last): 2025-12-04T13:38:32.1249387Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1249429Z getattr(self, test_name)() 2025-12-04T13:38:32.1249643Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1249678Z fn() 2025-12-04T13:38:32.1249831Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1249871Z method(*args, **kwargs) 2025-12-04T13:38:32.1250023Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1250062Z method(*args, **kwargs) 2025-12-04T13:38:32.1250229Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1250266Z with policy(): 2025-12-04T13:38:32.1250422Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1250463Z raise RuntimeError(msg) 2025-12-04T13:38:32.1250821Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 17483956224. 2025-12-04T13:38:32.1250823Z 2025-12-04T13:38:32.1250899Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1251130Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:38:32.1251146Z 2025-12-04T13:38:32.1251234Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1251297Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
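The failure reported just above is raised by the suite's CUDA memory-leak guard, which this shard enables via PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1: it records per-device caching-allocator usage before the test body runs and raises afterwards if the numbers have grown, which is what produces the "Caching allocator allocated memory was 512 and is now reported as ..." message. The following is only a minimal sketch of that kind of before/after comparison, not the actual CudaMemoryLeakCheck used by torch/testing/_internal/common_utils.py; the class name and the simple greater-than comparison are illustrative assumptions.

    # Minimal sketch of a before/after caching-allocator comparison; not PyTorch's
    # real leak checker, just the shape of the check that fails in the log above.
    import torch

    class SimpleCudaLeakCheck:
        """Raise if caching-allocator usage grows across the guarded block."""

        def __enter__(self):
            torch.cuda.synchronize()
            self.before = [
                torch.cuda.memory_allocated(d) for d in range(torch.cuda.device_count())
            ]
            return self

        def __exit__(self, exc_type, exc, tb):
            if exc_type is not None:
                return False  # never mask the test's own exception
            torch.cuda.synchronize()
            for device, before in enumerate(self.before):
                after = torch.cuda.memory_allocated(device)
                if after > before:
                    raise RuntimeError(
                        f"possible leak on device {device}: "
                        f"allocated memory was {before} and is now {after}"
                    )
            return False

    if torch.cuda.is_available():
        with SimpleCudaLeakCheck():
            x = torch.ones(1024, device="cuda")
            del x  # freeing the tensor keeps the check green; keeping it would trip it

Rerunning the repro command printed above with PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 re-enables the same guard outside CI.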
2025-12-04T13:38:32.1251362Z ====================== 1 failed, 32 deselected in 23.10s ======================= 2025-12-04T13:38:32.1251401Z Got exit code 1 2025-12-04T13:38:32.1251584Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:38:32.1251711Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:32.1251902Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-5eacd8724e83b056.xml 2025-12-04T13:38:32.1251960Z ============================= test session starts ============================== 2025-12-04T13:38:32.1252076Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.1252119Z cachedir: .pytest_cache 2025-12-04T13:38:32.1252280Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.1252325Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.1252367Z configfile: pytest.ini 2025-12-04T13:38:32.1252545Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.1252621Z collecting ... collected 60 items / 15 deselected / 45 selected 2025-12-04T13:38:32.1252676Z stepcurrent: skipping 15 already run items. 2025-12-04T13:38:32.1252720Z Running 18 items in this shard 2025-12-04T13:38:32.1252722Z 2025-12-04T13:38:32.1253025Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_none_cuda I1204 13:20:54.200000 400977 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 401046 2025-12-04T13:38:32.1253182Z I1204 13:20:54.200000 400977 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 401047 2025-12-04T13:38:32.1253346Z I1204 13:20:54.201000 400977 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 401048 2025-12-04T13:38:32.1253498Z I1204 13:20:54.202000 400977 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 401049 2025-12-04T13:38:32.1254080Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1254130Z _warn_cpu_init() 2025-12-04T13:38:32.1254701Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.1254739Z _warn_cpu_init() 2025-12-04T13:38:32.1255038Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:38:32.1255077Z _init_core_state( 2025-12-04T13:38:32.1255569Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1255644Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1255941Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:38:32.1255977Z _init_core_state( 2025-12-04T13:38:32.1256468Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1256529Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1257115Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1257152Z _warn_cpu_init() 2025-12-04T13:38:32.1257450Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:38:32.1257489Z _init_core_state( 2025-12-04T13:38:32.1257988Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1258049Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1258617Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.1258667Z _warn_cpu_init() 2025-12-04T13:38:32.1259157Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1259214Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1259508Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:38:32.1259545Z _init_core_state( 2025-12-04T13:38:32.1260073Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1260148Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1260439Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.1260481Z return func(*args, **kwargs) 2025-12-04T13:38:32.1260969Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1261029Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1261526Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1261585Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1261813Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1261858Z return func(*args, **kwargs) 2025-12-04T13:38:32.1262083Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1262126Z return func(*args, **kwargs) 2025-12-04T13:38:32.1262364Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T13:38:32.1262405Z return func(*args, **kwargs) 2025-12-04T13:38:32.1262629Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1262669Z return func(*args, **kwargs) 2025-12-04T13:38:32.1262889Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1262942Z return func(*args, **kwargs) 2025-12-04T13:38:32.1263163Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1263202Z return func(*args, **kwargs) 2025-12-04T13:38:32.1263424Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1263464Z return func(*args, **kwargs) 2025-12-04T13:38:32.1263686Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1263726Z return func(*args, **kwargs) 2025-12-04T13:38:32.1263873Z [rank2]:E1204 13:21:02.954000 401048 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1264036Z [rank2]:E1204 13:21:02.954000 401048 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1264342Z [rank2]:E1204 13:21:02.954000 401048 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1264501Z [rank2]:E1204 13:21:02.954000 401048 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1264788Z [rank2]:E1204 13:21:02.954000 401048 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1264915Z [rank2]:E1204 13:21:02.954000 401048 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1265191Z [rank2]:E1204 13:21:02.954000 401048 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1265344Z [rank2]:E1204 13:21:02.954000 401048 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1265637Z [rank2]:E1204 13:21:02.954000 401048 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1265786Z [rank2]:E1204 13:21:02.954000 401048 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1266062Z [rank2]:E1204 13:21:02.954000 401048 site-packages/torch/testing/_internal/common_distributed.py:935] File
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1266200Z [rank2]:E1204 13:21:02.954000 401048 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1266489Z [rank2]:E1204 13:21:02.954000 401048 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1266639Z [rank2]:E1204 13:21:02.954000 401048 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1267118Z [rank2]:E1204 13:21:02.954000 401048 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17467179008. 2025-12-04T13:38:32.1267235Z [rank2]:E1204 13:21:02.954000 401048 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1267442Z [rank2]:E1204 13:21:02.954000 401048 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1267799Z [rank2]:E1204 13:21:02.954000 401048 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda 2025-12-04T13:38:32.1267914Z [rank2]:E1204 13:21:02.954000 401048 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1268129Z [rank2]:E1204 13:21:02.954000 401048 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1268293Z [rank2]:E1204 13:21:02.954000 401048 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.1268345Z dist init r=2, world=4 2025-12-04T13:38:32.1268482Z [rank0]:E1204 13:21:02.963000 401046 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1268643Z [rank0]:E1204 13:21:02.963000 401046 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1268933Z [rank0]:E1204 13:21:02.963000 401046 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1269090Z [rank0]:E1204 13:21:02.963000 401046 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1269383Z [rank0]:E1204 13:21:02.963000 401046 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1269510Z [rank0]:E1204 13:21:02.963000 401046 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1269831Z [rank0]:E1204 13:21:02.963000 401046 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1269990Z [rank0]:E1204 13:21:02.963000 401046 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1270268Z [rank0]:E1204 13:21:02.963000 401046 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1270414Z [rank0]:E1204 13:21:02.963000 401046 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1270691Z [rank0]:E1204 13:21:02.963000 401046 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1270841Z [rank0]:E1204 13:21:02.963000 401046 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1271119Z [rank0]:E1204 13:21:02.963000 401046 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1271269Z [rank0]:E1204 13:21:02.963000 401046 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1271741Z [rank0]:E1204 13:21:02.963000 401046 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 
2025-12-04T13:38:32.1271870Z [rank0]:E1204 13:21:02.963000 401046 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1272066Z [rank0]:E1204 13:21:02.963000 401046 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1272417Z [rank0]:E1204 13:21:02.963000 401046 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda 2025-12-04T13:38:32.1272532Z [rank0]:E1204 13:21:02.963000 401046 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1272747Z [rank0]:E1204 13:21:02.963000 401046 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1272927Z [rank0]:E1204 13:21:02.963000 401046 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.1272965Z dist init r=0, world=4 2025-12-04T13:38:32.1273106Z [rank1]:E1204 13:21:02.964000 401047 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1273265Z [rank1]:E1204 13:21:02.964000 401047 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1273551Z [rank1]:E1204 13:21:02.964000 401047 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1273705Z [rank1]:E1204 13:21:02.964000 401047 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1273993Z [rank1]:E1204 13:21:02.964000 401047 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1274128Z [rank1]:E1204 13:21:02.964000 401047 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1274405Z [rank1]:E1204 13:21:02.964000 401047 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1274554Z [rank1]:E1204 13:21:02.964000 401047 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1274833Z [rank1]:E1204 13:21:02.964000 401047 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1274983Z [rank1]:E1204 13:21:02.964000 401047 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1275269Z [rank1]:E1204 13:21:02.964000 401047 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1275407Z [rank1]:E1204 13:21:02.964000 401047 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.1275685Z [rank1]:E1204 13:21:02.964000 401047 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1275843Z [rank1]:E1204 13:21:02.964000 401047 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1276320Z [rank1]:E1204 13:21:02.964000 401047 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 17483956224. 2025-12-04T13:38:32.1276434Z [rank1]:E1204 13:21:02.964000 401047 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1276633Z [rank1]:E1204 13:21:02.964000 401047 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1276985Z [rank1]:E1204 13:21:02.964000 401047 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda 2025-12-04T13:38:32.1277109Z [rank1]:E1204 13:21:02.964000 401047 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1277323Z [rank1]:E1204 13:21:02.964000 401047 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1277486Z [rank1]:E1204 13:21:02.964000 401047 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.1277525Z dist init r=1, world=4 2025-12-04T13:38:32.1277662Z [rank3]:E1204 13:21:02.966000 401049 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1277824Z [rank3]:E1204 13:21:02.966000 401049 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1278110Z [rank3]:E1204 13:21:02.966000 401049 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1278264Z [rank3]:E1204 13:21:02.966000 401049 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1278561Z [rank3]:E1204 13:21:02.966000 401049 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1278686Z [rank3]:E1204 13:21:02.966000 401049 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1278963Z [rank3]:E1204 13:21:02.966000 401049 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1279111Z [rank3]:E1204 13:21:02.966000 401049 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T13:38:32.1279402Z [rank3]:E1204 13:21:02.966000 401049 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1279549Z [rank3]:E1204 13:21:02.966000 401049 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1279859Z [rank3]:E1204 13:21:02.966000 401049 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1279995Z [rank3]:E1204 13:21:02.966000 401049 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1280287Z [rank3]:E1204 13:21:02.966000 401049 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1280438Z [rank3]:E1204 13:21:02.966000 401049 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1280911Z [rank3]:E1204 13:21:02.966000 401049 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17416847360. 2025-12-04T13:38:32.1281026Z [rank3]:E1204 13:21:02.966000 401049 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1281223Z [rank3]:E1204 13:21:02.966000 401049 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1281594Z [rank3]:E1204 13:21:02.966000 401049 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda 2025-12-04T13:38:32.1281708Z [rank3]:E1204 13:21:02.966000 401049 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1281918Z [rank3]:E1204 13:21:02.966000 401049 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1282083Z [rank3]:E1204 13:21:02.966000 401049 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.1282122Z dist init r=3, world=4 2025-12-04T13:38:32.1282459Z [rank1]:[W1204 13:21:03.225081823 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1282801Z [rank2]:[W1204 13:21:03.237153496 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1283134Z [rank3]:[W1204 13:21:03.259860422 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1283465Z [rank0]:[W1204 13:21:03.263817071 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1283506Z FAILED [23.1379s] [ 5%] 2025-12-04T13:38:32.1283509Z 2025-12-04T13:38:32.1283582Z =================================== FAILURES =================================== 2025-12-04T13:38:32.1283682Z _____ TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda _____ 2025-12-04T13:38:32.1283730Z Traceback (most recent call last): 2025-12-04T13:38:32.1283894Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.1283939Z self._join_processes(fn) 2025-12-04T13:38:32.1284112Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.1284167Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.1284357Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.1284405Z raise RuntimeError(error) 2025-12-04T13:38:32.1284483Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.1284531Z Traceback (most recent call last): 2025-12-04T13:38:32.1284693Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1284737Z getattr(self, test_name)() 2025-12-04T13:38:32.1284896Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1284933Z fn() 2025-12-04T13:38:32.1285085Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1285127Z method(*args, **kwargs) 2025-12-04T13:38:32.1285283Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1285333Z method(*args, **kwargs) 2025-12-04T13:38:32.1285486Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1285523Z with policy(): 2025-12-04T13:38:32.1285680Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1285720Z raise RuntimeError(msg) 2025-12-04T13:38:32.1286070Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 
2025-12-04T13:38:32.1286073Z 2025-12-04T13:38:32.1286148Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1286377Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda 2025-12-04T13:38:32.1286380Z 2025-12-04T13:38:32.1286467Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1286470Z 2025-12-04T13:38:32.1286474Z 2025-12-04T13:38:32.1286560Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.1286651Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.1286885Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-5eacd8724e83b056.xml - 2025-12-04T13:38:32.1286947Z =========================== short test summary info ============================ 2025-12-04T13:38:32.1287187Z FAILED [23.1379s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.1287235Z Traceback (most recent call last): 2025-12-04T13:38:32.1287401Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1287455Z getattr(self, test_name)() 2025-12-04T13:38:32.1287618Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1287655Z fn() 2025-12-04T13:38:32.1287807Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1287848Z method(*args, **kwargs) 2025-12-04T13:38:32.1288001Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1288054Z method(*args, **kwargs) 2025-12-04T13:38:32.1288206Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1288245Z with policy(): 2025-12-04T13:38:32.1288399Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1288442Z raise RuntimeError(msg) 2025-12-04T13:38:32.1288790Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 2025-12-04T13:38:32.1288794Z 2025-12-04T13:38:32.1288870Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1289097Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda 2025-12-04T13:38:32.1289111Z 2025-12-04T13:38:32.1289197Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1289262Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
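Alongside the leak itself, every rank also prints the ProcessGroupNCCL.cpp warning that destroy_process_group() was never called before program exit. Below is only a hedged sketch of the setup/teardown pattern that warning asks for; the MASTER_ADDR/MASTER_PORT defaults and the RANK/WORLD_SIZE environment variables are assumptions for a single-node launch, not values taken from this job.

    # Hedged sketch of explicit process-group setup and teardown; env-var defaults
    # below are illustrative assumptions for a single-node run.
    import os
    import torch
    import torch.distributed as dist

    def main() -> None:
        os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
        os.environ.setdefault("MASTER_PORT", "29500")
        rank = int(os.environ.get("RANK", "0"))
        world_size = int(os.environ.get("WORLD_SIZE", "1"))
        device = torch.device("cuda", rank % torch.cuda.device_count())
        torch.cuda.set_device(device)
        # Passing device_id also addresses the "barrier(): using the device under
        # current context" warning printed earlier by c10d_logger.py.
        dist.init_process_group("nccl", rank=rank, world_size=world_size, device_id=device)
        try:
            dist.barrier()
        finally:
            dist.destroy_process_group()  # explicit teardown avoids the exit-time warning

    if __name__ == "__main__":
        main()

Putting the teardown in a finally block means it still runs when the body raises, which is exactly the path these leak-check failures take.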
2025-12-04T13:38:32.1289325Z ====================== 1 failed, 15 deselected in 23.30s ======================= 2025-12-04T13:38:32.1289363Z Got exit code 1 2025-12-04T13:38:32.1289404Z Retrying single test... 2025-12-04T13:38:32.1289625Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-3dfb0e9aaec2e3f1.xml 2025-12-04T13:38:32.1289683Z ============================= test session starts ============================== 2025-12-04T13:38:32.1289799Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.1289839Z cachedir: .pytest_cache 2025-12-04T13:38:32.1290000Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.1290047Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.1290090Z configfile: pytest.ini 2025-12-04T13:38:32.1290254Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.1290328Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.1290564Z stepcurrent: skipping 15 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_none_cuda 2025-12-04T13:38:32.1290610Z Running 1 items in this shard 2025-12-04T13:38:32.1290612Z 2025-12-04T13:38:32.1290911Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_none_cuda I1204 13:21:19.868000 402387 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 402456 2025-12-04T13:38:32.1291067Z I1204 13:21:19.869000 402387 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 402457 2025-12-04T13:38:32.1291221Z I1204 13:21:19.869000 402387 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 402458 2025-12-04T13:38:32.1291385Z I1204 13:21:19.870000 402387 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 402459 2025-12-04T13:38:32.1291968Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1292020Z _warn_cpu_init() 2025-12-04T13:38:32.1292318Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:38:32.1292357Z _init_core_state( 2025-12-04T13:38:32.1292855Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:32.1292918Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1293489Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1293542Z _warn_cpu_init() 2025-12-04T13:38:32.1293839Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:38:32.1293879Z _init_core_state( 2025-12-04T13:38:32.1294372Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1294435Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1295020Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1295058Z _warn_cpu_init() 2025-12-04T13:38:32.1295358Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:38:32.1295399Z _init_core_state( 2025-12-04T13:38:32.1295901Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1295965Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1296532Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.1296595Z _warn_cpu_init() 2025-12-04T13:38:32.1297087Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1297146Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1297635Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1297693Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1298005Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:38:32.1298043Z _init_core_state( 2025-12-04T13:38:32.1298533Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1298594Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1298884Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.1298932Z return func(*args, **kwargs) 2025-12-04T13:38:32.1299430Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1299492Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1299751Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1299797Z return func(*args, **kwargs) 2025-12-04T13:38:32.1300023Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1300072Z return func(*args, **kwargs) 2025-12-04T13:38:32.1300296Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T13:38:32.1300354Z return func(*args, **kwargs) 2025-12-04T13:38:32.1300581Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1300623Z return func(*args, **kwargs) 2025-12-04T13:38:32.1300845Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1300886Z return func(*args, **kwargs) 2025-12-04T13:38:32.1301109Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1301164Z return func(*args, **kwargs) 2025-12-04T13:38:32.1301388Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1301429Z return func(*args, **kwargs) 2025-12-04T13:38:32.1301653Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1301694Z return func(*args, **kwargs) 2025-12-04T13:38:32.1301843Z [rank2]:E1204 13:21:28.559000 402458 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1302008Z [rank2]:E1204 13:21:28.559000 402458 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1302315Z [rank2]:E1204 13:21:28.559000 402458 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1302479Z [rank2]:E1204 13:21:28.559000 402458 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1302770Z [rank2]:E1204 13:21:28.559000 402458 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1302901Z [rank2]:E1204 13:21:28.559000 402458 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1303181Z [rank2]:E1204 13:21:28.559000 402458 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1303337Z [rank2]:E1204 13:21:28.559000 402458 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1303630Z [rank2]:E1204 13:21:28.559000 402458 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1303783Z [rank2]:E1204 13:21:28.559000 402458 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1304061Z [rank2]:E1204 13:21:28.559000 402458 site-packages/torch/testing/_internal/common_distributed.py:935] File
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1304198Z [rank2]:E1204 13:21:28.559000 402458 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1304481Z [rank2]:E1204 13:21:28.559000 402458 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1304639Z [rank2]:E1204 13:21:28.559000 402458 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1305120Z [rank2]:E1204 13:21:28.559000 402458 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17467179008. 2025-12-04T13:38:32.1305236Z [rank2]:E1204 13:21:28.559000 402458 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1305447Z [rank2]:E1204 13:21:28.559000 402458 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1305803Z [rank2]:E1204 13:21:28.559000 402458 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda 2025-12-04T13:38:32.1305919Z [rank2]:E1204 13:21:28.559000 402458 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1306135Z [rank2]:E1204 13:21:28.559000 402458 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1306301Z [rank2]:E1204 13:21:28.559000 402458 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.1306346Z dist init r=2, world=4 2025-12-04T13:38:32.1306495Z [rank3]:E1204 13:21:28.601000 402459 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1306658Z [rank3]:E1204 13:21:28.601000 402459 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1306946Z [rank3]:E1204 13:21:28.601000 402459 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1307104Z [rank3]:E1204 13:21:28.601000 402459 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1307392Z [rank3]:E1204 13:21:28.601000 402459 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1307518Z [rank3]:E1204 13:21:28.601000 402459 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1307797Z [rank3]:E1204 13:21:28.601000 402459 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1307957Z [rank3]:E1204 13:21:28.601000 402459 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1308235Z [rank3]:E1204 13:21:28.601000 402459 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1308382Z [rank3]:E1204 13:21:28.601000 402459 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1308667Z [rank3]:E1204 13:21:28.601000 402459 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1308808Z [rank3]:E1204 13:21:28.601000 402459 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1309097Z [rank3]:E1204 13:21:28.601000 402459 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1309249Z [rank3]:E1204 13:21:28.601000 402459 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1309759Z [rank3]:E1204 13:21:28.601000 402459 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17416847360. 
2025-12-04T13:38:32.1309893Z [rank3]:E1204 13:21:28.601000 402459 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1310089Z [rank3]:E1204 13:21:28.601000 402459 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1310444Z [rank3]:E1204 13:21:28.601000 402459 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda 2025-12-04T13:38:32.1310563Z [rank3]:E1204 13:21:28.601000 402459 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1310776Z [rank3]:E1204 13:21:28.601000 402459 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1310958Z [rank3]:E1204 13:21:28.601000 402459 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.1310998Z dist init r=3, world=4 2025-12-04T13:38:32.1311141Z [rank0]:E1204 13:21:28.601000 402456 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1311300Z [rank0]:E1204 13:21:28.601000 402456 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1311590Z [rank0]:E1204 13:21:28.601000 402456 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1311744Z [rank0]:E1204 13:21:28.601000 402456 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1312035Z [rank0]:E1204 13:21:28.601000 402456 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1312165Z [rank0]:E1204 13:21:28.601000 402456 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1312454Z [rank0]:E1204 13:21:28.601000 402456 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1312605Z [rank0]:E1204 13:21:28.601000 402456 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1312880Z [rank0]:E1204 13:21:28.601000 402456 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1313033Z [rank0]:E1204 13:21:28.601000 402456 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1313328Z [rank0]:E1204 13:21:28.601000 402456 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1313469Z [rank0]:E1204 13:21:28.601000 402456 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.1313752Z [rank0]:E1204 13:21:28.601000 402456 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1313902Z [rank0]:E1204 13:21:28.601000 402456 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1314388Z [rank0]:E1204 13:21:28.601000 402456 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 2025-12-04T13:38:32.1314504Z [rank0]:E1204 13:21:28.601000 402456 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1314703Z [rank0]:E1204 13:21:28.601000 402456 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1315057Z [rank0]:E1204 13:21:28.601000 402456 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda 2025-12-04T13:38:32.1315181Z [rank0]:E1204 13:21:28.601000 402456 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1315400Z [rank0]:E1204 13:21:28.601000 402456 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1315564Z [rank0]:E1204 13:21:28.601000 402456 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.1315607Z dist init r=0, world=4 2025-12-04T13:38:32.1315745Z [rank1]:E1204 13:21:28.612000 402457 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1315909Z [rank1]:E1204 13:21:28.612000 402457 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1316196Z [rank1]:E1204 13:21:28.612000 402457 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1316353Z [rank1]:E1204 13:21:28.612000 402457 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1316663Z [rank1]:E1204 13:21:28.612000 402457 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1316788Z [rank1]:E1204 13:21:28.612000 402457 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1317069Z [rank1]:E1204 13:21:28.612000 402457 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1317218Z [rank1]:E1204 13:21:28.612000 402457 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T13:38:32.1317510Z [rank1]:E1204 13:21:28.612000 402457 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1317658Z [rank1]:E1204 13:21:28.612000 402457 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1317936Z [rank1]:E1204 13:21:28.612000 402457 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1318072Z [rank1]:E1204 13:21:28.612000 402457 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1318363Z [rank1]:E1204 13:21:28.612000 402457 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1318516Z [rank1]:E1204 13:21:28.612000 402457 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1318990Z [rank1]:E1204 13:21:28.612000 402457 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 17483956224. 2025-12-04T13:38:32.1319106Z [rank1]:E1204 13:21:28.612000 402457 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1319301Z [rank1]:E1204 13:21:28.612000 402457 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1319708Z [rank1]:E1204 13:21:28.612000 402457 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda 2025-12-04T13:38:32.1319828Z [rank1]:E1204 13:21:28.612000 402457 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1320039Z [rank1]:E1204 13:21:28.612000 402457 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1320206Z [rank1]:E1204 13:21:28.612000 402457 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.1320244Z dist init r=1, world=4 2025-12-04T13:38:32.1320582Z [rank2]:[W1204 13:21:28.725525673 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1320928Z [rank3]:[W1204 13:21:28.887556774 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1321260Z [rank1]:[W1204 13:21:28.894284017 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1321591Z [rank0]:[W1204 13:21:28.927860742 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1321634Z FAILED [22.8393s] [100%] 2025-12-04T13:38:32.1321637Z 2025-12-04T13:38:32.1321696Z =================================== FAILURES =================================== 2025-12-04T13:38:32.1321808Z _____ TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda _____ 2025-12-04T13:38:32.1321858Z Traceback (most recent call last): 2025-12-04T13:38:32.1322025Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.1322072Z self._join_processes(fn) 2025-12-04T13:38:32.1322247Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.1322304Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.1322482Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.1322571Z raise RuntimeError(error) 2025-12-04T13:38:32.1322650Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.1322699Z Traceback (most recent call last): 2025-12-04T13:38:32.1322861Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1322908Z getattr(self, test_name)() 2025-12-04T13:38:32.1323067Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1323106Z fn() 2025-12-04T13:38:32.1323258Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1323303Z method(*args, **kwargs) 2025-12-04T13:38:32.1323459Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1323515Z method(*args, **kwargs) 2025-12-04T13:38:32.1323672Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1323709Z with policy(): 2025-12-04T13:38:32.1323868Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1323910Z raise RuntimeError(msg) 2025-12-04T13:38:32.1324261Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17467179008. 
2025-12-04T13:38:32.1324264Z 2025-12-04T13:38:32.1324340Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1324571Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda 2025-12-04T13:38:32.1324575Z 2025-12-04T13:38:32.1324662Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1324664Z 2025-12-04T13:38:32.1324669Z 2025-12-04T13:38:32.1324746Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.1324849Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.1325087Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-3dfb0e9aaec2e3f1.xml - 2025-12-04T13:38:32.1325152Z =========================== short test summary info ============================ 2025-12-04T13:38:32.1325394Z FAILED [22.8393s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_none_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.1325445Z Traceback (most recent call last): 2025-12-04T13:38:32.1325613Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1325658Z getattr(self, test_name)() 2025-12-04T13:38:32.1325833Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1325872Z fn() 2025-12-04T13:38:32.1326026Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1326071Z method(*args, **kwargs) 2025-12-04T13:38:32.1326224Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1326267Z method(*args, **kwargs) 2025-12-04T13:38:32.1326431Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1326472Z with policy(): 2025-12-04T13:38:32.1326625Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1326670Z raise RuntimeError(msg) 2025-12-04T13:38:32.1327019Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17467179008. 2025-12-04T13:38:32.1327024Z 2025-12-04T13:38:32.1327099Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1327329Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda 2025-12-04T13:38:32.1327333Z 2025-12-04T13:38:32.1327430Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1327496Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:38:32.1327559Z ====================== 1 failed, 32 deselected in 23.00s ======================= 2025-12-04T13:38:32.1327602Z Got exit code 1 2025-12-04T13:38:32.1327642Z Retrying single test... 2025-12-04T13:38:32.1327836Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-111f1001e4d61fdc.xml 2025-12-04T13:38:32.1327894Z ============================= test session starts ============================== 2025-12-04T13:38:32.1328011Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.1328053Z cachedir: .pytest_cache 2025-12-04T13:38:32.1328218Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.1328266Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.1328309Z configfile: pytest.ini 2025-12-04T13:38:32.1328474Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.1328553Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.1328790Z stepcurrent: skipping 15 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_none_cuda 2025-12-04T13:38:32.1328838Z Running 1 items in this shard 2025-12-04T13:38:32.1328841Z 2025-12-04T13:38:32.1329144Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_none_cuda I1204 13:21:45.461000 403797 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 403866 2025-12-04T13:38:32.1329299Z I1204 13:21:45.462000 403797 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 403867 2025-12-04T13:38:32.1329456Z I1204 13:21:45.462000 403797 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 403868 2025-12-04T13:38:32.1329667Z I1204 13:21:45.463000 403797 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 403869 2025-12-04T13:38:32.1330251Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1330290Z _warn_cpu_init() 2025-12-04T13:38:32.1330593Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:38:32.1330648Z _init_core_state( 2025-12-04T13:38:32.1331145Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
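The `device_id` warning just above is triggered by passing the bare device "cuda" with no index; the fix it suggests is to pin the current device per rank, or to pass an indexed device, before FSDP initialization. A minimal sketch, assuming LOCAL_RANK is provided by the launcher:

    import os
    import torch

    # Assumed launcher convention; the test harness in this log assigns ranks itself.
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))

    torch.cuda.set_device(local_rank)             # option 1: pin the current device
    device_id = torch.device("cuda", local_rank)  # option 2: pass an indexed device to FSDP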
2025-12-04T13:38:32.1331213Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1331789Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1331845Z _warn_cpu_init() 2025-12-04T13:38:32.1332147Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:38:32.1332185Z _init_core_state( 2025-12-04T13:38:32.1332681Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1332743Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1333335Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1333373Z _warn_cpu_init() 2025-12-04T13:38:32.1333669Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:38:32.1333710Z _init_core_state( 2025-12-04T13:38:32.1334199Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1334273Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1334846Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
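The CPU-init warning above recommends letting FSDP move the module to the GPU itself by passing `device_id`, which is also what `sync_module_states=True` requires. A minimal sketch under the same assumptions (process group already initialized, LOCAL_RANK from the launcher, and a placeholder `nn.Linear` standing in for the test's mixture-of-experts model):

    import os
    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    cpu_module = torch.nn.Linear(8, 8)  # placeholder module, still on CPU

    # FSDP moves the module to cuda:local_rank before sharding, so sharding
    # initialization and sync_module_states both run on the GPU.
    sharded = FSDP(
        cpu_module,
        device_id=torch.device("cuda", local_rank),
        sync_module_states=True,
    )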
2025-12-04T13:38:32.1334886Z _warn_cpu_init() 2025-12-04T13:38:32.1335377Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1335448Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1335747Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:38:32.1335785Z _init_core_state( 2025-12-04T13:38:32.1336275Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1336344Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1336832Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1336892Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1337184Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.1337232Z return func(*args, **kwargs) 2025-12-04T13:38:32.1337721Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1337792Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1338023Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1338070Z return func(*args, **kwargs) 2025-12-04T13:38:32.1338296Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1338342Z return func(*args, **kwargs) 2025-12-04T13:38:32.1338568Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T13:38:32.1338609Z return func(*args, **kwargs) 2025-12-04T13:38:32.1338845Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1338886Z return func(*args, **kwargs) 2025-12-04T13:38:32.1341337Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.1341381Z return func(*args, **kwargs) 2025-12-04T13:38:32.1341613Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.1341673Z return func(*args, **kwargs) 2025-12-04T13:38:32.1341897Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.1341938Z return func(*args, **kwargs) 2025-12-04T13:38:32.1342162Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.1342202Z return func(*args, **kwargs) 2025-12-04T13:38:32.1342352Z [rank2]:E1204 13:21:54.271000 403868 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1342536Z [rank2]:E1204 13:21:54.271000 403868 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1342831Z [rank2]:E1204 13:21:54.271000 403868 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1343011Z [rank2]:E1204 13:21:54.271000 403868 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1343302Z [rank2]:E1204 13:21:54.271000 403868 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1343427Z [rank2]:E1204 13:21:54.271000 403868 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1343708Z [rank2]:E1204 13:21:54.271000 403868 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1343858Z [rank2]:E1204 13:21:54.271000 403868 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1344137Z [rank2]:E1204 13:21:54.271000 403868 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1344301Z [rank2]:E1204 13:21:54.271000 403868 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1344584Z [rank2]:E1204 13:21:54.271000 403868 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1344725Z [rank2]:E1204 13:21:54.271000 403868 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1345004Z [rank2]:E1204 13:21:54.271000 403868 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1345159Z [rank2]:E1204 13:21:54.271000 403868 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1345634Z [rank2]:E1204 13:21:54.271000 403868 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17467179008. 2025-12-04T13:38:32.1345842Z [rank2]:E1204 13:21:54.271000 403868 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1346043Z [rank2]:E1204 13:21:54.271000 403868 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1346421Z [rank2]:E1204 13:21:54.271000 403868 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda 2025-12-04T13:38:32.1346540Z [rank2]:E1204 13:21:54.271000 403868 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1346751Z [rank2]:E1204 13:21:54.271000 403868 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1346921Z [rank2]:E1204 13:21:54.271000 403868 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.1346960Z dist init r=2, world=4 2025-12-04T13:38:32.1347103Z [rank1]:E1204 13:21:54.331000 403867 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1347274Z [rank1]:E1204 13:21:54.331000 403867 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1347569Z [rank1]:E1204 13:21:54.331000 403867 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1347723Z [rank1]:E1204 13:21:54.331000 403867 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1348012Z [rank1]:E1204 13:21:54.331000 403867 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1348140Z [rank1]:E1204 13:21:54.331000 403867 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1348418Z [rank1]:E1204 13:21:54.331000 403867 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1348568Z [rank1]:E1204 13:21:54.331000 403867 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1348854Z [rank1]:E1204 13:21:54.331000 403867 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1349006Z [rank1]:E1204 13:21:54.331000 403867 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1349284Z [rank1]:E1204 13:21:54.331000 403867 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1349426Z [rank1]:E1204 13:21:54.331000 403867 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1349740Z [rank1]:E1204 13:21:54.331000 403867 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1349891Z [rank1]:E1204 13:21:54.331000 403867 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1350390Z [rank1]:E1204 13:21:54.331000 403867 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 17483956224. 
2025-12-04T13:38:32.1350522Z [rank1]:E1204 13:21:54.331000 403867 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1350721Z [rank1]:E1204 13:21:54.331000 403867 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1351073Z [rank1]:E1204 13:21:54.331000 403867 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda 2025-12-04T13:38:32.1351188Z [rank1]:E1204 13:21:54.331000 403867 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1351402Z [rank1]:E1204 13:21:54.331000 403867 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1351581Z [rank1]:E1204 13:21:54.331000 403867 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.1351625Z dist init r=1, world=4 2025-12-04T13:38:32.1351763Z [rank3]:E1204 13:21:54.333000 403869 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1351926Z [rank3]:E1204 13:21:54.333000 403869 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1352219Z [rank3]:E1204 13:21:54.333000 403869 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1352377Z [rank3]:E1204 13:21:54.333000 403869 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1352667Z [rank3]:E1204 13:21:54.333000 403869 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1352794Z [rank3]:E1204 13:21:54.333000 403869 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1353088Z [rank3]:E1204 13:21:54.333000 403869 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1353235Z [rank3]:E1204 13:21:54.333000 403869 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1353516Z [rank3]:E1204 13:21:54.333000 403869 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1353665Z [rank3]:E1204 13:21:54.333000 403869 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1353945Z [rank3]:E1204 13:21:54.333000 403869 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1354081Z [rank3]:E1204 13:21:54.333000 403869 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.1354378Z [rank3]:E1204 13:21:54.333000 403869 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1354531Z [rank3]:E1204 13:21:54.333000 403869 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1355014Z [rank3]:E1204 13:21:54.333000 403869 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17416847360. 2025-12-04T13:38:32.1355134Z [rank3]:E1204 13:21:54.333000 403869 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1355329Z [rank3]:E1204 13:21:54.333000 403869 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1355687Z [rank3]:E1204 13:21:54.333000 403869 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda 2025-12-04T13:38:32.1355818Z [rank3]:E1204 13:21:54.333000 403869 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1356028Z [rank3]:E1204 13:21:54.333000 403869 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1356197Z [rank3]:E1204 13:21:54.333000 403869 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.1356236Z dist init r=3, world=4 2025-12-04T13:38:32.1356376Z [rank0]:E1204 13:21:54.338000 403866 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1356536Z [rank0]:E1204 13:21:54.338000 403866 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1356825Z [rank0]:E1204 13:21:54.338000 403866 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1356980Z [rank0]:E1204 13:21:54.338000 403866 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1357282Z [rank0]:E1204 13:21:54.338000 403866 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1357407Z [rank0]:E1204 13:21:54.338000 403866 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1357691Z [rank0]:E1204 13:21:54.338000 403866 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1357842Z [rank0]:E1204 13:21:54.338000 403866 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T13:38:32.1358125Z [rank0]:E1204 13:21:54.338000 403866 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1358282Z [rank0]:E1204 13:21:54.338000 403866 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1358562Z [rank0]:E1204 13:21:54.338000 403866 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1358713Z [rank0]:E1204 13:21:54.338000 403866 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1363526Z [rank0]:E1204 13:21:54.338000 403866 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1363718Z [rank0]:E1204 13:21:54.338000 403866 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1364206Z [rank0]:E1204 13:21:54.338000 403866 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 2025-12-04T13:38:32.1364330Z [rank0]:E1204 13:21:54.338000 403866 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1364526Z [rank0]:E1204 13:21:54.338000 403866 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1364907Z [rank0]:E1204 13:21:54.338000 403866 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda 2025-12-04T13:38:32.1365023Z [rank0]:E1204 13:21:54.338000 403866 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1365239Z [rank0]:E1204 13:21:54.338000 403866 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1365409Z [rank0]:E1204 13:21:54.338000 403866 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.1365451Z dist init r=0, world=4 2025-12-04T13:38:32.1365790Z [rank2]:[W1204 13:21:54.443312702 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1366125Z [rank0]:[W1204 13:21:54.563087262 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1366473Z [rank1]:[W1204 13:21:54.609955015 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1366805Z [rank3]:[W1204 13:21:54.614971930 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1366850Z FAILED [23.0354s] [100%] 2025-12-04T13:38:32.1366853Z 2025-12-04T13:38:32.1366913Z =================================== FAILURES =================================== 2025-12-04T13:38:32.1367015Z _____ TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda _____ 2025-12-04T13:38:32.1367062Z Traceback (most recent call last): 2025-12-04T13:38:32.1367237Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.1367281Z self._join_processes(fn) 2025-12-04T13:38:32.1367461Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.1367544Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.1367724Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.1367784Z raise RuntimeError(error) 2025-12-04T13:38:32.1367864Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.1367912Z Traceback (most recent call last): 2025-12-04T13:38:32.1368078Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1368123Z getattr(self, test_name)() 2025-12-04T13:38:32.1368285Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1368323Z fn() 2025-12-04T13:38:32.1368476Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1368521Z method(*args, **kwargs) 2025-12-04T13:38:32.1368673Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1368718Z method(*args, **kwargs) 2025-12-04T13:38:32.1368884Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1368924Z with policy(): 2025-12-04T13:38:32.1369079Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1369122Z raise RuntimeError(msg) 2025-12-04T13:38:32.1369479Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17467179008. 
2025-12-04T13:38:32.1369481Z 2025-12-04T13:38:32.1369562Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1369829Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda 2025-12-04T13:38:32.1369833Z 2025-12-04T13:38:32.1369922Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1369924Z 2025-12-04T13:38:32.1369926Z 2025-12-04T13:38:32.1370005Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.1370093Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.1370349Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-111f1001e4d61fdc.xml - 2025-12-04T13:38:32.1370411Z =========================== short test summary info ============================ 2025-12-04T13:38:32.1370664Z FAILED [23.0354s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_none_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.1370711Z Traceback (most recent call last): 2025-12-04T13:38:32.1370884Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1370925Z getattr(self, test_name)() 2025-12-04T13:38:32.1371089Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1371125Z fn() 2025-12-04T13:38:32.1371284Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1371326Z method(*args, **kwargs) 2025-12-04T13:38:32.1371492Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1371535Z method(*args, **kwargs) 2025-12-04T13:38:32.1371689Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1371744Z with policy(): 2025-12-04T13:38:32.1371897Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1371940Z raise RuntimeError(msg) 2025-12-04T13:38:32.1372289Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17467179008. 2025-12-04T13:38:32.1372291Z 2025-12-04T13:38:32.1372368Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1372596Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_none_cuda 2025-12-04T13:38:32.1372598Z 2025-12-04T13:38:32.1372690Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1372767Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:38:32.1372833Z ====================== 1 failed, 32 deselected in 23.18s ======================= 2025-12-04T13:38:32.1372870Z Got exit code 1 2025-12-04T13:38:32.1373050Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_none_cuda 2025-12-04T13:38:32.1373182Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:32.1373372Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-76607ebe359c7ec7.xml 2025-12-04T13:38:32.1373436Z ============================= test session starts ============================== 2025-12-04T13:38:32.1373552Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.1373597Z cachedir: .pytest_cache 2025-12-04T13:38:32.1373758Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.1373808Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.1373850Z configfile: pytest.ini 2025-12-04T13:38:32.1374020Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.1374106Z collecting ... collected 60 items / 16 deselected / 44 selected 2025-12-04T13:38:32.1374163Z stepcurrent: skipping 16 already run items. 2025-12-04T13:38:32.1374206Z Running 17 items in this shard 2025-12-04T13:38:32.1374208Z 2025-12-04T13:38:32.1374528Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_shard_grad_op_cuda I1204 13:22:11.389000 405207 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 405276 2025-12-04T13:38:32.1374685Z I1204 13:22:11.390000 405207 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 405277 2025-12-04T13:38:32.1374841Z I1204 13:22:11.390000 405207 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 405278 2025-12-04T13:38:32.1374993Z I1204 13:22:11.391000 405207 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 405279 2025-12-04T13:38:32.1375601Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1375643Z _warn_cpu_init() 2025-12-04T13:38:32.1375959Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T13:38:32.1376000Z _init_core_state( 2025-12-04T13:38:32.1376494Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1376561Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1377142Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1377191Z _warn_cpu_init() 2025-12-04T13:38:32.1377497Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T13:38:32.1377535Z _init_core_state( 2025-12-04T13:38:32.1378028Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1378090Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1378684Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1378725Z _warn_cpu_init() 2025-12-04T13:38:32.1379022Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T13:38:32.1379062Z _init_core_state( 2025-12-04T13:38:32.1379553Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1379647Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1380239Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.1380289Z _warn_cpu_init() 2025-12-04T13:38:32.1380777Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1380836Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1381327Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1381387Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1381684Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T13:38:32.1381736Z _init_core_state( 2025-12-04T13:38:32.1382227Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1382287Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1382581Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.1382628Z return func(*args, **kwargs) 2025-12-04T13:38:32.1383138Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1383199Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1383433Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1383476Z return func(*args, **kwargs) 2025-12-04T13:38:32.1383703Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1383746Z return func(*args, **kwargs) 2025-12-04T13:38:32.1383975Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T13:38:32.1384016Z return func(*args, **kwargs) 2025-12-04T13:38:32.1384247Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1384288Z return func(*args, **kwargs) 2025-12-04T13:38:32.1384524Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.1384564Z return func(*args, **kwargs) 2025-12-04T13:38:32.1384789Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.1384841Z return func(*args, **kwargs) 2025-12-04T13:38:32.1385066Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.1385108Z return func(*args, **kwargs) 2025-12-04T13:38:32.1385329Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.1385373Z return func(*args, **kwargs) 2025-12-04T13:38:32.1385520Z [rank2]:E1204 13:22:20.214000 405278 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1385689Z [rank2]:E1204 13:22:20.214000 405278 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1385999Z [rank2]:E1204 13:22:20.214000 405278 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1386159Z [rank2]:E1204 13:22:20.214000 405278 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1386449Z [rank2]:E1204 13:22:20.214000 405278 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1386577Z [rank2]:E1204 13:22:20.214000 405278 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1386864Z [rank2]:E1204 13:22:20.214000 405278 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1387017Z [rank2]:E1204 13:22:20.214000 405278 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1387297Z [rank2]:E1204 13:22:20.214000 405278 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1387461Z [rank2]:E1204 13:22:20.214000 405278 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1387739Z [rank2]:E1204 13:22:20.214000 405278 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1387881Z [rank2]:E1204 13:22:20.214000 405278 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1388166Z [rank2]:E1204 13:22:20.214000 405278 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1388317Z [rank2]:E1204 13:22:20.214000 405278 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1388815Z [rank2]:E1204 13:22:20.214000 405278 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17467179008. 2025-12-04T13:38:32.1388935Z [rank2]:E1204 13:22:20.214000 405278 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1389143Z [rank2]:E1204 13:22:20.214000 405278 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1389516Z [rank2]:E1204 13:22:20.214000 405278 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.1389671Z [rank2]:E1204 13:22:20.214000 405278 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1389888Z [rank2]:E1204 13:22:20.214000 405278 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1390057Z [rank2]:E1204 13:22:20.214000 405278 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.1390096Z dist init r=2, world=4 2025-12-04T13:38:32.1390252Z [rank3]:E1204 13:22:20.233000 405279 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1390412Z [rank3]:E1204 13:22:20.233000 405279 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1390705Z [rank3]:E1204 13:22:20.233000 405279 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1390860Z [rank3]:E1204 13:22:20.233000 405279 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1391150Z [rank3]:E1204 13:22:20.233000 405279 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1391276Z [rank3]:E1204 13:22:20.233000 405279 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1391560Z [rank3]:E1204 13:22:20.233000 405279 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1391723Z [rank3]:E1204 13:22:20.233000 405279 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1392001Z [rank3]:E1204 13:22:20.233000 405279 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1392149Z [rank3]:E1204 13:22:20.233000 405279 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1392426Z [rank3]:E1204 13:22:20.233000 405279 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1392567Z [rank3]:E1204 13:22:20.233000 405279 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1392847Z [rank3]:E1204 13:22:20.233000 405279 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1392999Z [rank3]:E1204 13:22:20.233000 405279 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1393499Z [rank3]:E1204 13:22:20.233000 405279 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17416847360. 
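The UserWarning repeated throughout this session, "FSDP got the argument `device_id` cuda ... which does not have an explicit index", together with the "passed-in `module` is on CPU" warning, points at the fix the messages themselves suggest: pin each rank to its GPU before constructing FSDP, or pass an indexed device as `device_id`. A minimal sketch of that pattern follows; `model` and `local_rank` are placeholders, and this is illustrative rather than a patch to the failing test.

# Per-rank device pinning before FSDP construction, following the guidance in the
# UserWarnings above; `model` and `local_rank` are placeholders for illustration.
import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_with_fsdp(model: torch.nn.Module, local_rank: int) -> FSDP:
    # Make a bare "cuda" device resolve to an explicit index for this process.
    torch.cuda.set_device(local_rank)
    # An indexed device_id avoids the "does not have an explicit index" warning and
    # moves a CPU-resident module to the GPU for sharding initialization, which is
    # also what the `sync_module_states=True` path requires.
    return FSDP(
        model,
        device_id=torch.device("cuda", local_rank),
        sync_module_states=True,
    )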
2025-12-04T13:38:32.1393630Z [rank3]:E1204 13:22:20.233000 405279 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1393829Z [rank3]:E1204 13:22:20.233000 405279 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1394192Z [rank3]:E1204 13:22:20.233000 405279 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.1394307Z [rank3]:E1204 13:22:20.233000 405279 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1394519Z [rank3]:E1204 13:22:20.233000 405279 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1394698Z [rank3]:E1204 13:22:20.233000 405279 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.1394739Z dist init r=3, world=4 2025-12-04T13:38:32.1394878Z [rank0]:E1204 13:22:20.243000 405276 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1395039Z [rank0]:E1204 13:22:20.243000 405276 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1395326Z [rank0]:E1204 13:22:20.243000 405276 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1395483Z [rank0]:E1204 13:22:20.243000 405276 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1395771Z [rank0]:E1204 13:22:20.243000 405276 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1395896Z [rank0]:E1204 13:22:20.243000 405276 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1396183Z [rank0]:E1204 13:22:20.243000 405276 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1396335Z [rank0]:E1204 13:22:20.243000 405276 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1396615Z [rank0]:E1204 13:22:20.243000 405276 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1396764Z [rank0]:E1204 13:22:20.243000 405276 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1397047Z [rank0]:E1204 13:22:20.243000 405276 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1397184Z [rank0]:E1204 13:22:20.243000 405276 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.1397476Z [rank0]:E1204 13:22:20.243000 405276 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1397624Z [rank0]:E1204 13:22:20.243000 405276 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1398124Z [rank0]:E1204 13:22:20.243000 405276 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 2025-12-04T13:38:32.1398242Z [rank0]:E1204 13:22:20.243000 405276 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1398438Z [rank0]:E1204 13:22:20.243000 405276 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1398802Z [rank0]:E1204 13:22:20.243000 405276 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.1398932Z [rank0]:E1204 13:22:20.243000 405276 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1399147Z [rank0]:E1204 13:22:20.243000 405276 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1399314Z [rank0]:E1204 13:22:20.243000 405276 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.1399355Z dist init r=0, world=4 2025-12-04T13:38:32.1399494Z [rank1]:E1204 13:22:20.251000 405277 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1399683Z [rank1]:E1204 13:22:20.251000 405277 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1399973Z [rank1]:E1204 13:22:20.251000 405277 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1400127Z [rank1]:E1204 13:22:20.251000 405277 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1400428Z [rank1]:E1204 13:22:20.251000 405277 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1400551Z [rank1]:E1204 13:22:20.251000 405277 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1400830Z [rank1]:E1204 13:22:20.251000 405277 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1400979Z [rank1]:E1204 13:22:20.251000 405277 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:38:32.1401258Z [rank1]:E1204 13:22:20.251000 405277 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1401407Z [rank1]:E1204 13:22:20.251000 405277 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1401696Z [rank1]:E1204 13:22:20.251000 405277 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1401834Z [rank1]:E1204 13:22:20.251000 405277 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1402125Z [rank1]:E1204 13:22:20.251000 405277 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1402276Z [rank1]:E1204 13:22:20.251000 405277 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1402759Z [rank1]:E1204 13:22:20.251000 405277 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 17483956224. 2025-12-04T13:38:32.1402877Z [rank1]:E1204 13:22:20.251000 405277 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1403074Z [rank1]:E1204 13:22:20.251000 405277 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1403447Z [rank1]:E1204 13:22:20.251000 405277 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.1403563Z [rank1]:E1204 13:22:20.251000 405277 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1403774Z [rank1]:E1204 13:22:20.251000 405277 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1403940Z [rank1]:E1204 13:22:20.251000 405277 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.1403980Z dist init r=1, world=4 2025-12-04T13:38:32.1404320Z [rank2]:[W1204 13:22:20.375098828 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1404663Z [rank3]:[W1204 13:22:20.418451687 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1404991Z [rank1]:[W1204 13:22:20.511303066 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1405319Z [rank0]:[W1204 13:22:20.514677593 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1405361Z FAILED [23.2380s] [ 5%] 2025-12-04T13:38:32.1405363Z 2025-12-04T13:38:32.1405424Z =================================== FAILURES =================================== 2025-12-04T13:38:32.1405526Z _ TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda _ 2025-12-04T13:38:32.1405576Z Traceback (most recent call last): 2025-12-04T13:38:32.1405743Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.1405786Z self._join_processes(fn) 2025-12-04T13:38:32.1405973Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.1406028Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.1406222Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.1406267Z raise RuntimeError(error) 2025-12-04T13:38:32.1406348Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.1406393Z Traceback (most recent call last): 2025-12-04T13:38:32.1406559Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1406600Z getattr(self, test_name)() 2025-12-04T13:38:32.1406762Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1406799Z fn() 2025-12-04T13:38:32.1406954Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1406995Z method(*args, **kwargs) 2025-12-04T13:38:32.1407153Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1407204Z method(*args, **kwargs) 2025-12-04T13:38:32.1407359Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1407398Z with policy(): 2025-12-04T13:38:32.1407555Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1407595Z raise RuntimeError(msg) 2025-12-04T13:38:32.1407958Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 
2025-12-04T13:38:32.1407961Z 2025-12-04T13:38:32.1408042Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1408283Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.1408286Z 2025-12-04T13:38:32.1408375Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1408377Z 2025-12-04T13:38:32.1408436Z Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.1408493Z Traceback (most recent call last): 2025-12-04T13:38:32.1408657Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1408701Z getattr(self, test_name)() 2025-12-04T13:38:32.1408861Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1408899Z fn() 2025-12-04T13:38:32.1409049Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1409094Z method(*args, **kwargs) 2025-12-04T13:38:32.1409244Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1409288Z method(*args, **kwargs) 2025-12-04T13:38:32.1409438Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1409478Z with policy(): 2025-12-04T13:38:32.1409680Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1409725Z raise RuntimeError(msg) 2025-12-04T13:38:32.1410100Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17467179008. 
2025-12-04T13:38:32.1410116Z 2025-12-04T13:38:32.1410191Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1410429Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.1410432Z 2025-12-04T13:38:32.1410519Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1410521Z 2025-12-04T13:38:32.1410582Z Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.1410630Z Traceback (most recent call last): 2025-12-04T13:38:32.1410798Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1410841Z getattr(self, test_name)() 2025-12-04T13:38:32.1411005Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1411056Z fn() 2025-12-04T13:38:32.1411210Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1411250Z method(*args, **kwargs) 2025-12-04T13:38:32.1411403Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1411443Z method(*args, **kwargs) 2025-12-04T13:38:32.1411597Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1411636Z with policy(): 2025-12-04T13:38:32.1411790Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1411834Z raise RuntimeError(msg) 2025-12-04T13:38:32.1412193Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17416847360. 2025-12-04T13:38:32.1412197Z 2025-12-04T13:38:32.1412272Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1412522Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.1412524Z 2025-12-04T13:38:32.1412615Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1412617Z 2025-12-04T13:38:32.1412619Z 2025-12-04T13:38:32.1412696Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.1412785Z Process 0 terminated with exit code 10, terminating remaining processes. 
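The ProcessGroupNCCL warnings above ("destroy_process_group() was not called before program exit, which can leak resources") are emitted because the worker processes exit without tearing down the default process group. In a standalone distributed script the usual pattern is explicit init and teardown; the sketch below assumes a launcher such as torchrun has set RANK, WORLD_SIZE, MASTER_ADDR, MASTER_PORT and LOCAL_RANK, and the training body is elided.

# Explicit init/teardown for torch.distributed, addressing the
# "destroy_process_group() was not called before program exit" warning above.
# Assumes the launcher provides RANK, WORLD_SIZE, MASTER_ADDR, MASTER_PORT,
# LOCAL_RANK in the environment.
import os

import torch
import torch.distributed as dist

def main() -> None:
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    torch.cuda.set_device(local_rank)
    # Note: recent PyTorch versions also accept a device_id argument to
    # init_process_group, which additionally silences the "barrier(): using the
    # device under current context" warning seen above.
    dist.init_process_group(backend="nccl")
    try:
        # ... distributed work goes here ...
        dist.barrier()
    finally:
        dist.destroy_process_group()  # explicit teardown avoids the leak warning

if __name__ == "__main__":
    main()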
2025-12-04T13:38:32.1413026Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-76607ebe359c7ec7.xml - 2025-12-04T13:38:32.1413088Z =========================== short test summary info ============================ 2025-12-04T13:38:32.1413346Z FAILED [23.2380s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.1413393Z Traceback (most recent call last): 2025-12-04T13:38:32.1413560Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1413602Z getattr(self, test_name)() 2025-12-04T13:38:32.1413781Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1413818Z fn() 2025-12-04T13:38:32.1413994Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1414036Z method(*args, **kwargs) 2025-12-04T13:38:32.1414191Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1414230Z method(*args, **kwargs) 2025-12-04T13:38:32.1414385Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1414421Z with policy(): 2025-12-04T13:38:32.1414574Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1414615Z raise RuntimeError(msg) 2025-12-04T13:38:32.1414976Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 
2025-12-04T13:38:32.1414990Z 2025-12-04T13:38:32.1415067Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1415300Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.1415302Z 2025-12-04T13:38:32.1415391Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1415393Z 2025-12-04T13:38:32.1415452Z Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.1415499Z Traceback (most recent call last): 2025-12-04T13:38:32.1415665Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1415708Z getattr(self, test_name)() 2025-12-04T13:38:32.1415870Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1415908Z fn() 2025-12-04T13:38:32.1416059Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1416102Z method(*args, **kwargs) 2025-12-04T13:38:32.1416262Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1416305Z method(*args, **kwargs) 2025-12-04T13:38:32.1416455Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1416494Z with policy(): 2025-12-04T13:38:32.1416646Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1416691Z raise RuntimeError(msg) 2025-12-04T13:38:32.1417048Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17467179008. 
2025-12-04T13:38:32.1417051Z 2025-12-04T13:38:32.1417124Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1417362Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.1417365Z 2025-12-04T13:38:32.1417451Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1417453Z 2025-12-04T13:38:32.1417532Z Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.1417583Z Traceback (most recent call last): 2025-12-04T13:38:32.1417746Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1417806Z getattr(self, test_name)() 2025-12-04T13:38:32.1417967Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1418006Z fn() 2025-12-04T13:38:32.1418157Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1418203Z method(*args, **kwargs) 2025-12-04T13:38:32.1418355Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1418399Z method(*args, **kwargs) 2025-12-04T13:38:32.1418551Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1418592Z with policy(): 2025-12-04T13:38:32.1418744Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1418799Z raise RuntimeError(msg) 2025-12-04T13:38:32.1419154Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17416847360. 2025-12-04T13:38:32.1419160Z 2025-12-04T13:38:32.1419234Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1419472Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.1419475Z 2025-12-04T13:38:32.1419561Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1419672Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.1419738Z ====================== 1 failed, 16 deselected in 23.38s ======================= 2025-12-04T13:38:32.1419780Z Got exit code 1 2025-12-04T13:38:32.1419821Z Retrying single test... 
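The repro line printed with each failure is meant to be run directly from the base repo dir, as the log states. Purely as a convenience, a small Python wrapper around that same command is sketched below; the helper name is hypothetical, and the environment variables mirror the log output exactly.

# Hypothetical convenience wrapper around the repro command printed above.
# Run from the base pytorch repo dir; env vars are the ones the log prints.
import os
import subprocess

def reproduce(test_id: str) -> int:
    env = dict(os.environ)
    env["PYTORCH_TEST_WITH_ROCM"] = "1"
    env["PYTORCH_TEST_CUDA_MEM_LEAK_CHECK"] = "1"
    cmd = ["python", "test/distributed/fsdp/test_fsdp_core.py", test_id]
    return subprocess.call(cmd, env=env)

# Example:
# reproduce("TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda")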
2025-12-04T13:38:32.1420014Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-177888eb9ff038f0.xml 2025-12-04T13:38:32.1420088Z ============================= test session starts ============================== 2025-12-04T13:38:32.1420207Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.1420249Z cachedir: .pytest_cache 2025-12-04T13:38:32.1420413Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.1420461Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.1420505Z configfile: pytest.ini 2025-12-04T13:38:32.1420671Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.1420752Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.1420982Z stepcurrent: skipping 16 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.1421031Z Running 1 items in this shard 2025-12-04T13:38:32.1421033Z 2025-12-04T13:38:32.1421351Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_shard_grad_op_cuda I1204 13:22:37.364000 406617 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 406686 2025-12-04T13:38:32.1421522Z I1204 13:22:37.365000 406617 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 406687 2025-12-04T13:38:32.1421680Z I1204 13:22:37.366000 406617 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 406688 2025-12-04T13:38:32.1421853Z I1204 13:22:37.366000 406617 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 406689 2025-12-04T13:38:32.1422439Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1422479Z _warn_cpu_init() 2025-12-04T13:38:32.1422786Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T13:38:32.1422829Z _init_core_state( 2025-12-04T13:38:32.1423342Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:32.1423409Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1423983Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1424027Z _warn_cpu_init() 2025-12-04T13:38:32.1424331Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T13:38:32.1424369Z _init_core_state( 2025-12-04T13:38:32.1424876Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1424939Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1425515Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1425555Z _warn_cpu_init() 2025-12-04T13:38:32.1425859Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T13:38:32.1425900Z _init_core_state( 2025-12-04T13:38:32.1426395Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1426468Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1427034Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.1427075Z _warn_cpu_init() 2025-12-04T13:38:32.1427568Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1427638Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1427942Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T13:38:32.1427979Z _init_core_state( 2025-12-04T13:38:32.1428471Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1428529Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1428823Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.1428866Z return func(*args, **kwargs) 2025-12-04T13:38:32.1429363Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1429427Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1429945Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1430009Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1430242Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1430288Z return func(*args, **kwargs) 2025-12-04T13:38:32.1430540Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1430587Z return func(*args, **kwargs) 2025-12-04T13:38:32.1430812Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T13:38:32.1430868Z return func(*args, **kwargs) 2025-12-04T13:38:32.1431093Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1431134Z return func(*args, **kwargs) 2025-12-04T13:38:32.1431357Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.1431398Z return func(*args, **kwargs) 2025-12-04T13:38:32.1431622Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.1431663Z return func(*args, **kwargs) 2025-12-04T13:38:32.1431887Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.1431943Z return func(*args, **kwargs) 2025-12-04T13:38:32.1432166Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.1432208Z return func(*args, **kwargs) 2025-12-04T13:38:32.1432358Z [rank2]:E1204 13:22:46.073000 406688 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1432524Z [rank2]:E1204 13:22:46.073000 406688 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1432818Z [rank2]:E1204 13:22:46.073000 406688 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1432979Z [rank2]:E1204 13:22:46.073000 406688 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1433265Z [rank2]:E1204 13:22:46.073000 406688 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1433408Z [rank2]:E1204 13:22:46.073000 406688 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1433687Z [rank2]:E1204 13:22:46.073000 406688 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1433838Z [rank2]:E1204 13:22:46.073000 406688 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1434116Z [rank2]:E1204 13:22:46.073000 406688 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1434267Z [rank2]:E1204 13:22:46.073000 406688 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1434545Z [rank2]:E1204 13:22:46.073000 406688 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1434683Z [rank2]:E1204 13:22:46.073000 406688 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1434976Z [rank2]:E1204 13:22:46.073000 406688 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1435137Z [rank2]:E1204 13:22:46.073000 406688 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1435626Z [rank2]:E1204 13:22:46.073000 406688 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17467179008. 2025-12-04T13:38:32.1435741Z [rank2]:E1204 13:22:46.073000 406688 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1435941Z [rank2]:E1204 13:22:46.073000 406688 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1436310Z [rank2]:E1204 13:22:46.073000 406688 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.1436436Z [rank2]:E1204 13:22:46.073000 406688 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1436654Z [rank2]:E1204 13:22:46.073000 406688 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1436820Z [rank2]:E1204 13:22:46.073000 406688 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.1436863Z dist init r=2, world=4 2025-12-04T13:38:32.1437002Z [rank0]:E1204 13:22:46.097000 406686 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1437168Z [rank0]:E1204 13:22:46.097000 406686 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1437457Z [rank0]:E1204 13:22:46.097000 406686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1437626Z [rank0]:E1204 13:22:46.097000 406686 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1437913Z [rank0]:E1204 13:22:46.097000 406686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1438040Z [rank0]:E1204 13:22:46.097000 406686 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1438319Z [rank0]:E1204 13:22:46.097000 406686 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1438471Z [rank0]:E1204 13:22:46.097000 406686 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1438752Z [rank0]:E1204 13:22:46.097000 406686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1438899Z [rank0]:E1204 13:22:46.097000 406686 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1439189Z [rank0]:E1204 13:22:46.097000 406686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1439340Z [rank0]:E1204 13:22:46.097000 406686 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1439649Z [rank0]:E1204 13:22:46.097000 406686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1439803Z [rank0]:E1204 13:22:46.097000 406686 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1440288Z [rank0]:E1204 13:22:46.097000 406686 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 
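The RuntimeError above is raised by the test suite's memory-leak check, which compares per-device memory counters taken before and after the test (the caching-allocator figure and the driver-allocated figure quoted in the message). A simplified illustration of that comparison using public APIs, not the actual leak-check implementation in `common_utils.py`:

import torch

def snapshot(device: int) -> tuple[int, int]:
    # Bytes held by the caching allocator, and bytes the driver reports as
    # in use (total minus free), for one device.
    caching = torch.cuda.memory_allocated(device)
    free, total = torch.cuda.mem_get_info(device)
    return caching, total - free

before = snapshot(0)
# ... the test body would run here ...
torch.cuda.synchronize(0)
after = snapshot(0)
if after[0] > before[0] or after[1] > before[1]:
    raise RuntimeError(f"possible leak on device 0: {before} -> {after}")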
2025-12-04T13:38:32.1440407Z [rank0]:E1204 13:22:46.097000 406686 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1440616Z [rank0]:E1204 13:22:46.097000 406686 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1440984Z [rank0]:E1204 13:22:46.097000 406686 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.1441101Z [rank0]:E1204 13:22:46.097000 406686 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1441314Z [rank0]:E1204 13:22:46.097000 406686 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1441483Z [rank0]:E1204 13:22:46.097000 406686 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.1441524Z dist init r=0, world=4 2025-12-04T13:38:32.1441664Z [rank1]:E1204 13:22:46.136000 406687 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1441825Z [rank1]:E1204 13:22:46.136000 406687 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1442130Z [rank1]:E1204 13:22:46.136000 406687 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1442285Z [rank1]:E1204 13:22:46.136000 406687 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1442573Z [rank1]:E1204 13:22:46.136000 406687 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1442702Z [rank1]:E1204 13:22:46.136000 406687 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1442980Z [rank1]:E1204 13:22:46.136000 406687 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1443132Z [rank1]:E1204 13:22:46.136000 406687 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1443423Z [rank1]:E1204 13:22:46.136000 406687 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1443573Z [rank1]:E1204 13:22:46.136000 406687 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1443863Z [rank1]:E1204 13:22:46.136000 406687 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1444004Z [rank1]:E1204 13:22:46.136000 406687 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.1444287Z [rank1]:E1204 13:22:46.136000 406687 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1444436Z [rank1]:E1204 13:22:46.136000 406687 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1444919Z [rank1]:E1204 13:22:46.136000 406687 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 17483956224. 2025-12-04T13:38:32.1445044Z [rank1]:E1204 13:22:46.136000 406687 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1445244Z [rank1]:E1204 13:22:46.136000 406687 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1445613Z [rank1]:E1204 13:22:46.136000 406687 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.1445726Z [rank1]:E1204 13:22:46.136000 406687 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1445944Z [rank1]:E1204 13:22:46.136000 406687 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1446110Z [rank1]:E1204 13:22:46.136000 406687 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.1446152Z dist init r=1, world=4 2025-12-04T13:38:32.1446312Z [rank3]:E1204 13:22:46.156000 406689 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1446477Z [rank3]:E1204 13:22:46.156000 406689 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1446766Z [rank3]:E1204 13:22:46.156000 406689 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1446923Z [rank3]:E1204 13:22:46.156000 406689 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1447210Z [rank3]:E1204 13:22:46.156000 406689 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1447335Z [rank3]:E1204 13:22:46.156000 406689 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1447613Z [rank3]:E1204 13:22:46.156000 406689 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1447771Z [rank3]:E1204 13:22:46.156000 406689 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:38:32.1448060Z [rank3]:E1204 13:22:46.156000 406689 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1448208Z [rank3]:E1204 13:22:46.156000 406689 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1448486Z [rank3]:E1204 13:22:46.156000 406689 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1448622Z [rank3]:E1204 13:22:46.156000 406689 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1448909Z [rank3]:E1204 13:22:46.156000 406689 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1449074Z [rank3]:E1204 13:22:46.156000 406689 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1449555Z [rank3]:E1204 13:22:46.156000 406689 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17416847360. 2025-12-04T13:38:32.1449711Z [rank3]:E1204 13:22:46.156000 406689 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1449908Z [rank3]:E1204 13:22:46.156000 406689 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1450275Z [rank3]:E1204 13:22:46.156000 406689 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.1450393Z [rank3]:E1204 13:22:46.156000 406689 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1450618Z [rank3]:E1204 13:22:46.156000 406689 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1450786Z [rank3]:E1204 13:22:46.156000 406689 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.1450825Z dist init r=3, world=4 2025-12-04T13:38:32.1451165Z [rank2]:[W1204 13:22:46.248926659 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1451500Z [rank0]:[W1204 13:22:46.329820663 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1451831Z [rank1]:[W1204 13:22:46.412140739 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1452172Z [rank3]:[W1204 13:22:46.425487487 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1452213Z FAILED [23.0390s] [100%] 2025-12-04T13:38:32.1452228Z 2025-12-04T13:38:32.1452289Z =================================== FAILURES =================================== 2025-12-04T13:38:32.1452393Z _ TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda _ 2025-12-04T13:38:32.1452443Z Traceback (most recent call last): 2025-12-04T13:38:32.1452609Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.1452656Z self._join_processes(fn) 2025-12-04T13:38:32.1452831Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.1452889Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.1453069Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.1453119Z raise RuntimeError(error) 2025-12-04T13:38:32.1453200Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.1453263Z Traceback (most recent call last): 2025-12-04T13:38:32.1453425Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1453472Z getattr(self, test_name)() 2025-12-04T13:38:32.1453633Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1453672Z fn() 2025-12-04T13:38:32.1453825Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1453869Z method(*args, **kwargs) 2025-12-04T13:38:32.1454027Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1454069Z method(*args, **kwargs) 2025-12-04T13:38:32.1454223Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1454263Z with policy(): 2025-12-04T13:38:32.1454420Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1454462Z raise RuntimeError(msg) 2025-12-04T13:38:32.1454833Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17467179008. 
2025-12-04T13:38:32.1454836Z 2025-12-04T13:38:32.1454912Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1455155Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.1455158Z 2025-12-04T13:38:32.1455248Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1455250Z 2025-12-04T13:38:32.1455255Z 2025-12-04T13:38:32.1455331Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.1455422Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.1455658Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-177888eb9ff038f0.xml - 2025-12-04T13:38:32.1455722Z =========================== short test summary info ============================ 2025-12-04T13:38:32.1455985Z FAILED [23.0390s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_shard_grad_op_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.1456035Z Traceback (most recent call last): 2025-12-04T13:38:32.1456212Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1456259Z getattr(self, test_name)() 2025-12-04T13:38:32.1456422Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1456461Z fn() 2025-12-04T13:38:32.1456615Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1456658Z method(*args, **kwargs) 2025-12-04T13:38:32.1456810Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1456853Z method(*args, **kwargs) 2025-12-04T13:38:32.1457005Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1457047Z with policy(): 2025-12-04T13:38:32.1457199Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1457255Z raise RuntimeError(msg) 2025-12-04T13:38:32.1457616Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17467179008. 2025-12-04T13:38:32.1457618Z 2025-12-04T13:38:32.1457693Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1457933Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.1457936Z 2025-12-04T13:38:32.1458023Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1458091Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
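The ProcessGroupNCCL warnings earlier in this run note that destroy_process_group() was never called before the worker processes exited. In a standalone script, the shutdown the warning (and the linked docs) asks for looks roughly like this; the barrier is only a placeholder for real per-rank work:

import torch
import torch.distributed as dist

def main(rank: int, world_size: int) -> None:
    torch.cuda.set_device(rank)
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    try:
        dist.barrier()  # placeholder for the per-rank test/training work
    finally:
        # Release process-group resources even when the work above fails,
        # so the interpreter does not exit with a live process group.
        dist.destroy_process_group()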
2025-12-04T13:38:32.1458155Z ====================== 1 failed, 32 deselected in 23.20s ======================= 2025-12-04T13:38:32.1458195Z Got exit code 1 2025-12-04T13:38:32.1458236Z Retrying single test... 2025-12-04T13:38:32.1458430Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-16768b6b4b00c7ee.xml 2025-12-04T13:38:32.1458498Z ============================= test session starts ============================== 2025-12-04T13:38:32.1458617Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.1458659Z cachedir: .pytest_cache 2025-12-04T13:38:32.1458820Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.1458866Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.1458912Z configfile: pytest.ini 2025-12-04T13:38:32.1459076Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.1459154Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.1459385Z stepcurrent: skipping 16 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.1459431Z Running 1 items in this shard 2025-12-04T13:38:32.1459433Z 2025-12-04T13:38:32.1459794Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_shard_grad_op_cuda I1204 13:23:03.068000 408027 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 408096 2025-12-04T13:38:32.1459954Z I1204 13:23:03.069000 408027 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 408097 2025-12-04T13:38:32.1460125Z I1204 13:23:03.069000 408027 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 408098 2025-12-04T13:38:32.1460278Z I1204 13:23:03.070000 408027 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 408099 2025-12-04T13:38:32.1460871Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1460917Z _warn_cpu_init() 2025-12-04T13:38:32.1461222Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T13:38:32.1461285Z _init_core_state( 2025-12-04T13:38:32.1461778Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:32.1461843Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1462418Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1462463Z _warn_cpu_init() 2025-12-04T13:38:32.1462767Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T13:38:32.1462804Z _init_core_state( 2025-12-04T13:38:32.1463309Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1463373Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1463948Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1463990Z _warn_cpu_init() 2025-12-04T13:38:32.1464288Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T13:38:32.1464330Z _init_core_state( 2025-12-04T13:38:32.1464831Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1464904Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1465477Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.1465519Z _warn_cpu_init() 2025-12-04T13:38:32.1466012Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1466084Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1466572Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1466630Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1467120Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1467182Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1467495Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T13:38:32.1467537Z _init_core_state( 2025-12-04T13:38:32.1468024Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1468088Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1468378Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.1468426Z return func(*args, **kwargs) 2025-12-04T13:38:32.1468656Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1468702Z return func(*args, **kwargs) 2025-12-04T13:38:32.1468940Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1468982Z return func(*args, **kwargs) 2025-12-04T13:38:32.1469219Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
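The c10d_logger warning above ("barrier(): using the device under current context") can be silenced the way the message suggests, by binding the process group to a device at init time. A minimal sketch, assuming a torchrun-style launch; the environment-variable handling is illustrative:

import os
import torch
import torch.distributed as dist

# Assumes LOCAL_RANK and the rendezvous variables are set by the launcher.
local_rank = int(os.environ.get("LOCAL_RANK", "0"))
torch.cuda.set_device(local_rank)
dist.init_process_group(
    "nccl",
    device_id=torch.device("cuda", local_rank),  # binds the group to one device
)
dist.barrier()  # no longer has to guess the device from the current context
dist.destroy_process_group()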
2025-12-04T13:38:32.1469261Z return func(*args, **kwargs) 2025-12-04T13:38:32.1469485Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1469527Z return func(*args, **kwargs) 2025-12-04T13:38:32.1469786Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1469827Z return func(*args, **kwargs) 2025-12-04T13:38:32.1470057Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1470100Z return func(*args, **kwargs) 2025-12-04T13:38:32.1470339Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1470380Z return func(*args, **kwargs) 2025-12-04T13:38:32.1470602Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1470645Z return func(*args, **kwargs) 2025-12-04T13:38:32.1470795Z [rank1]:E1204 13:23:11.843000 408097 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1470963Z [rank1]:E1204 13:23:11.843000 408097 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1471255Z [rank1]:E1204 13:23:11.843000 408097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1471417Z [rank1]:E1204 13:23:11.843000 408097 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1471719Z [rank1]:E1204 13:23:11.843000 408097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1471849Z [rank1]:E1204 13:23:11.843000 408097 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1472129Z [rank1]:E1204 13:23:11.843000 408097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1472281Z [rank1]:E1204 13:23:11.843000 408097 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1472560Z [rank1]:E1204 13:23:11.843000 408097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1472712Z [rank1]:E1204 13:23:11.843000 408097 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1472993Z [rank1]:E1204 13:23:11.843000 408097 site-packages/torch/testing/_internal/common_distributed.py:935] File
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1473143Z [rank1]:E1204 13:23:11.843000 408097 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1473424Z [rank1]:E1204 13:23:11.843000 408097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1473588Z [rank1]:E1204 13:23:11.843000 408097 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1474075Z [rank1]:E1204 13:23:11.843000 408097 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 17483956224. 2025-12-04T13:38:32.1474196Z [rank1]:E1204 13:23:11.843000 408097 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1474392Z [rank1]:E1204 13:23:11.843000 408097 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1474773Z [rank1]:E1204 13:23:11.843000 408097 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.1474887Z [rank1]:E1204 13:23:11.843000 408097 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1475102Z [rank1]:E1204 13:23:11.843000 408097 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1475267Z [rank1]:E1204 13:23:11.843000 408097 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.1475311Z dist init r=1, world=4 2025-12-04T13:38:32.1475447Z [rank3]:E1204 13:23:11.894000 408099 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1475613Z [rank3]:E1204 13:23:11.894000 408099 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1475902Z [rank3]:E1204 13:23:11.894000 408099 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1476066Z [rank3]:E1204 13:23:11.894000 408099 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1476355Z [rank3]:E1204 13:23:11.894000 408099 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1476480Z [rank3]:E1204 13:23:11.894000 408099 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1476760Z [rank3]:E1204 13:23:11.894000 408099 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1476911Z [rank3]:E1204 13:23:11.894000 408099 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1477193Z [rank3]:E1204 13:23:11.894000 408099 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1477343Z [rank3]:E1204 13:23:11.894000 408099 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1477631Z [rank3]:E1204 13:23:11.894000 408099 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1477789Z [rank3]:E1204 13:23:11.894000 408099 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1478068Z [rank3]:E1204 13:23:11.894000 408099 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1478220Z [rank3]:E1204 13:23:11.894000 408099 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1478703Z [rank3]:E1204 13:23:11.894000 408099 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17416847360. 
2025-12-04T13:38:32.1478836Z [rank3]:E1204 13:23:11.894000 408099 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1479035Z [rank3]:E1204 13:23:11.894000 408099 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1479402Z [rank3]:E1204 13:23:11.894000 408099 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.1479519Z [rank3]:E1204 13:23:11.894000 408099 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1479773Z [rank3]:E1204 13:23:11.894000 408099 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1479944Z [rank3]:E1204 13:23:11.894000 408099 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.1479984Z dist init r=3, world=4 2025-12-04T13:38:32.1480125Z [rank0]:E1204 13:23:11.913000 408096 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1480298Z [rank0]:E1204 13:23:11.913000 408096 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1480589Z [rank0]:E1204 13:23:11.913000 408096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1480749Z [rank0]:E1204 13:23:11.913000 408096 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1481034Z [rank0]:E1204 13:23:11.913000 408096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1481162Z [rank0]:E1204 13:23:11.913000 408096 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1481440Z [rank0]:E1204 13:23:11.913000 408096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1481597Z [rank0]:E1204 13:23:11.913000 408096 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1481891Z [rank0]:E1204 13:23:11.913000 408096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1482057Z [rank0]:E1204 13:23:11.913000 408096 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1482339Z [rank0]:E1204 13:23:11.913000 408096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1482477Z [rank0]:E1204 13:23:11.913000 408096 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.1482760Z [rank0]:E1204 13:23:11.913000 408096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1482910Z [rank0]:E1204 13:23:11.913000 408096 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1483401Z [rank0]:E1204 13:23:11.913000 408096 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 2025-12-04T13:38:32.1483530Z [rank0]:E1204 13:23:11.913000 408096 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1483729Z [rank0]:E1204 13:23:11.913000 408096 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1484099Z [rank0]:E1204 13:23:11.913000 408096 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.1484213Z [rank0]:E1204 13:23:11.913000 408096 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1484430Z [rank0]:E1204 13:23:11.913000 408096 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1484595Z [rank0]:E1204 13:23:11.913000 408096 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.1484651Z dist init r=0, world=4 2025-12-04T13:38:32.1484789Z [rank2]:E1204 13:23:11.919000 408098 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1484952Z [rank2]:E1204 13:23:11.919000 408098 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1485238Z [rank2]:E1204 13:23:11.919000 408098 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1485398Z [rank2]:E1204 13:23:11.919000 408098 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1485686Z [rank2]:E1204 13:23:11.919000 408098 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1485811Z [rank2]:E1204 13:23:11.919000 408098 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1486105Z [rank2]:E1204 13:23:11.919000 408098 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1486257Z [rank2]:E1204 13:23:11.919000 408098 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:38:32.1486552Z [rank2]:E1204 13:23:11.919000 408098 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1486700Z [rank2]:E1204 13:23:11.919000 408098 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1486980Z [rank2]:E1204 13:23:11.919000 408098 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1487122Z [rank2]:E1204 13:23:11.919000 408098 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1487401Z [rank2]:E1204 13:23:11.919000 408098 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1487564Z [rank2]:E1204 13:23:11.919000 408098 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1488045Z [rank2]:E1204 13:23:11.919000 408098 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17467179008. 2025-12-04T13:38:32.1488162Z [rank2]:E1204 13:23:11.919000 408098 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1488358Z [rank2]:E1204 13:23:11.919000 408098 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1488726Z [rank2]:E1204 13:23:11.919000 408098 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.1488842Z [rank2]:E1204 13:23:11.919000 408098 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1489063Z [rank2]:E1204 13:23:11.919000 408098 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1489233Z [rank2]:E1204 13:23:11.919000 408098 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.1489273Z dist init r=2, world=4 2025-12-04T13:38:32.1489651Z [rank1]:[W1204 13:23:12.045593740 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1489980Z [rank3]:[W1204 13:23:12.154005590 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1490310Z [rank2]:[W1204 13:23:12.278942926 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1490652Z [rank0]:[W1204 13:23:12.290628555 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1490706Z FAILED [23.1388s] [100%] 2025-12-04T13:38:32.1490709Z 2025-12-04T13:38:32.1490771Z =================================== FAILURES =================================== 2025-12-04T13:38:32.1490874Z _ TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda _ 2025-12-04T13:38:32.1490924Z Traceback (most recent call last): 2025-12-04T13:38:32.1491090Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.1491138Z self._join_processes(fn) 2025-12-04T13:38:32.1491314Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.1491372Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.1491551Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.1491600Z raise RuntimeError(error) 2025-12-04T13:38:32.1491693Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.1491741Z Traceback (most recent call last): 2025-12-04T13:38:32.1491904Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1491950Z getattr(self, test_name)() 2025-12-04T13:38:32.1492113Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1492149Z fn() 2025-12-04T13:38:32.1492304Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1492347Z method(*args, **kwargs) 2025-12-04T13:38:32.1492503Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1492545Z method(*args, **kwargs) 2025-12-04T13:38:32.1492704Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1492742Z with policy(): 2025-12-04T13:38:32.1492901Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1492942Z raise RuntimeError(msg) 2025-12-04T13:38:32.1493324Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 17483956224. 
2025-12-04T13:38:32.1493327Z 2025-12-04T13:38:32.1493404Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1493647Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.1493651Z 2025-12-04T13:38:32.1493742Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1493744Z 2025-12-04T13:38:32.1493746Z 2025-12-04T13:38:32.1493821Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.1493912Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.1494147Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-16768b6b4b00c7ee.xml - 2025-12-04T13:38:32.1494210Z =========================== short test summary info ============================ 2025-12-04T13:38:32.1494484Z FAILED [23.1388s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_shard_grad_op_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.1494544Z Traceback (most recent call last): 2025-12-04T13:38:32.1494711Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1494758Z getattr(self, test_name)() 2025-12-04T13:38:32.1494918Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1494957Z fn() 2025-12-04T13:38:32.1495109Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1495154Z method(*args, **kwargs) 2025-12-04T13:38:32.1495307Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1495351Z method(*args, **kwargs) 2025-12-04T13:38:32.1495504Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1495556Z with policy(): 2025-12-04T13:38:32.1495715Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1495757Z raise RuntimeError(msg) 2025-12-04T13:38:32.1496114Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 17483956224. 2025-12-04T13:38:32.1496116Z 2025-12-04T13:38:32.1496193Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1496433Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.1496435Z 2025-12-04T13:38:32.1496522Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1496592Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:38:32.1496656Z ====================== 1 failed, 32 deselected in 23.28s ======================= 2025-12-04T13:38:32.1496698Z Got exit code 1 2025-12-04T13:38:32.1496883Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.1497024Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:32.1497212Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-f3169d33739d2fe6.xml 2025-12-04T13:38:32.1497275Z ============================= test session starts ============================== 2025-12-04T13:38:32.1497392Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.1497436Z cachedir: .pytest_cache 2025-12-04T13:38:32.1497599Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.1497646Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.1497690Z configfile: pytest.ini 2025-12-04T13:38:32.1497854Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.1497932Z collecting ... collected 60 items / 17 deselected / 43 selected 2025-12-04T13:38:32.1497986Z stepcurrent: skipping 17 already run items. 2025-12-04T13:38:32.1498033Z Running 16 items in this shard 2025-12-04T13:38:32.1498035Z 2025-12-04T13:38:32.1498386Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda I1204 13:23:28.993000 409437 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 409506 2025-12-04T13:38:32.1498558Z I1204 13:23:28.994000 409437 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 409507 2025-12-04T13:38:32.1498710Z I1204 13:23:28.994000 409437 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 409508 2025-12-04T13:38:32.1498865Z I1204 13:23:28.995000 409437 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 409509 2025-12-04T13:38:32.1499449Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1499489Z _warn_cpu_init() 2025-12-04T13:38:32.1500044Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
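The two UserWarnings above ("module is on CPU" and "device_id cuda ... does not have an explicit index") both point at the same fix: give FSDP a per-rank device with an explicit index, or call torch.cuda.set_device() first. A minimal sketch of that pattern, assuming the process group is already initialized and using a placeholder module (not the test's own wrappers in common_fsdp.py):

    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_with_explicit_device(module: torch.nn.Module) -> FSDP:
        rank = dist.get_rank()
        torch.cuda.set_device(rank)  # make the bare "cuda" device resolve to this rank's GPU
        # An indexed device_id also moves the CPU module to that GPU before sharding,
        # which avoids the "sharding initialization run on CPU" warning above.
        return FSDP(module, device_id=torch.device("cuda", rank))

    # e.g. wrap_with_explicit_device(torch.nn.Linear(8, 8))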
2025-12-04T13:38:32.1500109Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1500692Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1500734Z _warn_cpu_init() 2025-12-04T13:38:32.1501224Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1501303Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1501873Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1501915Z _warn_cpu_init() 2025-12-04T13:38:32.1502409Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1502468Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1503052Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1503104Z _warn_cpu_init() 2025-12-04T13:38:32.1503399Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1503483Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1503978Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:32.1504040Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1504330Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1504428Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1504724Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1504806Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1505094Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1505177Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1505676Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1505745Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1506035Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1506113Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1506609Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1506669Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1506961Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1507037Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1507335Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1507429Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1507919Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1507982Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1508269Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1508348Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1509672Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1509816Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1510051Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1510099Z return func(*args, **kwargs) 2025-12-04T13:38:32.1511390Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1511520Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1511750Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
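The FutureWarning repeated above deprecates the NO_SHARD sharding strategy and recommends DistributedDataParallel instead. Since NO_SHARD keeps full parameters on every rank and only synchronizes gradients, the drop-in replacement it suggests looks roughly like this (a sketch with a placeholder module, assuming the process group is already initialized):

    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    def wrap_without_sharding(module: torch.nn.Module) -> DDP:
        rank = dist.get_rank()
        torch.cuda.set_device(rank)
        # DDP replicates parameters per rank and all-reduces gradients,
        # matching what FSDP's deprecated NO_SHARD strategy provided.
        return DDP(module.cuda(), device_ids=[rank])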
2025-12-04T13:38:32.1511793Z return func(*args, **kwargs) 2025-12-04T13:38:32.1513064Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1513203Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1513431Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1513487Z return func(*args, **kwargs) 2025-12-04T13:38:32.1514748Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1514873Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1515103Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1515145Z return func(*args, **kwargs) 2025-12-04T13:38:32.1515383Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.1515425Z return func(*args, **kwargs) 2025-12-04T13:38:32.1515650Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 
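The AccumulateGrad stream-mismatch warnings above spell out their own remedies: drop references that keep the previous iteration's autograd graph alive, or, if the mismatch is intentional, call the suppression function named in the message. A small sketch of both options (illustrative; the suppression call is quoted from the warning text itself):

    import torch

    model = torch.nn.Linear(8, 8).cuda()
    batch = torch.randn(4, 8, device="cuda")

    # Option 1: do not keep the old graph alive across iterations.
    loss = model(batch).sum()
    loss.backward()
    del loss  # dropping the loss drops the stale AccumulateGrad nodes

    # Option 2: silence the check if the stream mismatch is intentional.
    torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)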
2025-12-04T13:38:32.1515692Z return func(*args, **kwargs) 2025-12-04T13:38:32.1515919Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.1515960Z return func(*args, **kwargs) 2025-12-04T13:38:32.1516185Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.1516226Z return func(*args, **kwargs) 2025-12-04T13:38:32.1516523Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.1516579Z return func(*args, **kwargs) 2025-12-04T13:38:32.1516726Z [rank3]:E1204 13:24:01.162000 409509 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1516903Z [rank3]:E1204 13:24:01.162000 409509 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1517199Z [rank3]:E1204 13:24:01.162000 409509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1517359Z [rank3]:E1204 13:24:01.162000 409509 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1517648Z [rank3]:E1204 13:24:01.162000 409509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1517776Z [rank3]:E1204 13:24:01.162000 409509 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1518054Z [rank3]:E1204 13:24:01.162000 409509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1518218Z [rank3]:E1204 13:24:01.162000 409509 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1518500Z [rank3]:E1204 13:24:01.162000 409509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1518649Z [rank3]:E1204 13:24:01.162000 409509 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1518930Z [rank3]:E1204 13:24:01.162000 409509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1519070Z [rank3]:E1204 13:24:01.162000 409509 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1519354Z [rank3]:E1204 13:24:01.162000 409509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1519516Z [rank3]:E1204 
13:24:01.162000 409509 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1520065Z [rank3]:E1204 13:24:01.162000 409509 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 3. CUDA driver allocated memory was 2250244096 and is now 17427333120. 2025-12-04T13:38:32.1520186Z [rank3]:E1204 13:24:01.162000 409509 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1520382Z [rank3]:E1204 13:24:01.162000 409509 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1520778Z [rank3]:E1204 13:24:01.162000 409509 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda 2025-12-04T13:38:32.1520892Z [rank3]:E1204 13:24:01.162000 409509 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1521121Z [rank3]:E1204 13:24:01.162000 409509 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1521304Z [rank3]:E1204 13:24:01.162000 409509 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.1521345Z dist init r=3, world=4 2025-12-04T13:38:32.1521487Z [rank2]:E1204 13:24:01.207000 409508 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1521647Z [rank2]:E1204 13:24:01.207000 409508 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1521942Z [rank2]:E1204 13:24:01.207000 409508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1522097Z [rank2]:E1204 13:24:01.207000 409508 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1522385Z [rank2]:E1204 13:24:01.207000 409508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1522525Z [rank2]:E1204 13:24:01.207000 409508 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1522805Z [rank2]:E1204 13:24:01.207000 409508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1522956Z [rank2]:E1204 13:24:01.207000 409508 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1523233Z [rank2]:E1204 13:24:01.207000 409508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, 
in wrapper 2025-12-04T13:38:32.1523385Z [rank2]:E1204 13:24:01.207000 409508 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1523662Z [rank2]:E1204 13:24:01.207000 409508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1523815Z [rank2]:E1204 13:24:01.207000 409508 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1524094Z [rank2]:E1204 13:24:01.207000 409508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1524247Z [rank2]:E1204 13:24:01.207000 409508 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1524760Z [rank2]:E1204 13:24:01.207000 409508 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 2. CUDA driver allocated memory was 2300575744 and is now 17477664768. 2025-12-04T13:38:32.1524879Z [rank2]:E1204 13:24:01.207000 409508 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1525077Z [rank2]:E1204 13:24:01.207000 409508 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1525477Z [rank2]:E1204 13:24:01.207000 409508 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda 2025-12-04T13:38:32.1525605Z [rank2]:E1204 13:24:01.207000 409508 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1525819Z [rank2]:E1204 13:24:01.207000 409508 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1525987Z [rank2]:E1204 13:24:01.207000 409508 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.1526033Z dist init r=2, world=4 2025-12-04T13:38:32.1526171Z [rank1]:E1204 13:24:01.225000 409507 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1526334Z [rank1]:E1204 13:24:01.225000 409507 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1526628Z [rank1]:E1204 13:24:01.225000 409507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1526803Z [rank1]:E1204 13:24:01.225000 409507 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1527088Z [rank1]:E1204 13:24:01.225000 409507 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1527215Z [rank1]:E1204 13:24:01.225000 409507 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1527491Z [rank1]:E1204 13:24:01.225000 409507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1527641Z [rank1]:E1204 13:24:01.225000 409507 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1527922Z [rank1]:E1204 13:24:01.225000 409507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1528069Z [rank1]:E1204 13:24:01.225000 409507 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1528361Z [rank1]:E1204 13:24:01.225000 409507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1528499Z [rank1]:E1204 13:24:01.225000 409507 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1528784Z [rank1]:E1204 13:24:01.225000 409507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1528935Z [rank1]:E1204 13:24:01.225000 409507 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1529445Z [rank1]:E1204 13:24:01.225000 409507 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 1. CUDA driver allocated memory was 2317352960 and is now 17494441984. 
2025-12-04T13:38:32.1529611Z [rank1]:E1204 13:24:01.225000 409507 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1529808Z [rank1]:E1204 13:24:01.225000 409507 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1530210Z [rank1]:E1204 13:24:01.225000 409507 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda 2025-12-04T13:38:32.1530324Z [rank1]:E1204 13:24:01.225000 409507 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1530539Z [rank1]:E1204 13:24:01.225000 409507 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1530707Z [rank1]:E1204 13:24:01.225000 409507 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.1530749Z dist init r=1, world=4 2025-12-04T13:38:32.1530890Z [rank0]:E1204 13:24:01.249000 409506 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1531064Z [rank0]:E1204 13:24:01.249000 409506 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1531356Z [rank0]:E1204 13:24:01.249000 409506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1531510Z [rank0]:E1204 13:24:01.249000 409506 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1531797Z [rank0]:E1204 13:24:01.249000 409506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1531922Z [rank0]:E1204 13:24:01.249000 409506 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1532203Z [rank0]:E1204 13:24:01.249000 409506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1532350Z [rank0]:E1204 13:24:01.249000 409506 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1532641Z [rank0]:E1204 13:24:01.249000 409506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1532793Z [rank0]:E1204 13:24:01.249000 409506 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1533069Z [rank0]:E1204 13:24:01.249000 409506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1533209Z [rank0]:E1204 13:24:01.249000 409506 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T13:38:32.1533488Z [rank0]:E1204 13:24:01.249000 409506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1533641Z [rank0]:E1204 13:24:01.249000 409506 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1534164Z [rank0]:E1204 13:24:01.249000 409506 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 0. CUDA driver allocated memory was 2453667840 and is now 17630756864. 2025-12-04T13:38:32.1534290Z [rank0]:E1204 13:24:01.249000 409506 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1534488Z [rank0]:E1204 13:24:01.249000 409506 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1534875Z [rank0]:E1204 13:24:01.249000 409506 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda 2025-12-04T13:38:32.1534992Z [rank0]:E1204 13:24:01.249000 409506 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1535203Z [rank0]:E1204 13:24:01.249000 409506 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1535382Z [rank0]:E1204 13:24:01.249000 409506 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.1535420Z dist init r=0, world=4 2025-12-04T13:38:32.1535763Z [rank3]:[W1204 13:24:01.328484486 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1536102Z [rank2]:[W1204 13:24:01.445492076 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1536429Z [rank1]:[W1204 13:24:01.545642154 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1536758Z [rank0]:[W1204 13:24:01.611323837 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1536809Z FAILED [46.4742s] [ 6%] 2025-12-04T13:38:32.1536812Z 2025-12-04T13:38:32.1536873Z =================================== FAILURES =================================== 2025-12-04T13:38:32.1537004Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda _ 2025-12-04T13:38:32.1537053Z Traceback (most recent call last): 2025-12-04T13:38:32.1537220Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.1537265Z self._join_processes(fn) 2025-12-04T13:38:32.1537443Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.1537498Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.1537681Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.1537726Z raise RuntimeError(error) 2025-12-04T13:38:32.1537810Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.1537856Z Traceback (most recent call last): 2025-12-04T13:38:32.1538033Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1538076Z getattr(self, test_name)() 2025-12-04T13:38:32.1538240Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1538288Z fn() 2025-12-04T13:38:32.1538445Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1538487Z method(*args, **kwargs) 2025-12-04T13:38:32.1538641Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1538683Z method(*args, **kwargs) 2025-12-04T13:38:32.1538838Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1538876Z with policy(): 2025-12-04T13:38:32.1539033Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1539075Z raise RuntimeError(msg) 2025-12-04T13:38:32.1539464Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 3. CUDA driver allocated memory was 2250244096 and is now 17427333120. 
2025-12-04T13:38:32.1539478Z 2025-12-04T13:38:32.1539557Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1539853Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda 2025-12-04T13:38:32.1539855Z 2025-12-04T13:38:32.1539948Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1539951Z 2025-12-04T13:38:32.1539953Z 2025-12-04T13:38:32.1540029Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.1540121Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.1540356Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-f3169d33739d2fe6.xml - 2025-12-04T13:38:32.1540421Z =========================== short test summary info ============================ 2025-12-04T13:38:32.1540711Z FAILED [46.4742s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.1540761Z Traceback (most recent call last): 2025-12-04T13:38:32.1540934Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1540976Z getattr(self, test_name)() 2025-12-04T13:38:32.1541140Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1541178Z fn() 2025-12-04T13:38:32.1541335Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1541376Z method(*args, **kwargs) 2025-12-04T13:38:32.1541532Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1541573Z method(*args, **kwargs) 2025-12-04T13:38:32.1541730Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1541767Z with policy(): 2025-12-04T13:38:32.1541922Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1541962Z raise RuntimeError(msg) 2025-12-04T13:38:32.1542363Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 3. CUDA driver allocated memory was 2250244096 and is now 17427333120. 2025-12-04T13:38:32.1542389Z 2025-12-04T13:38:32.1542464Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1542729Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda 2025-12-04T13:38:32.1542732Z 2025-12-04T13:38:32.1542821Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1542884Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
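Two recurring warnings in this run concern process-group lifecycle rather than the leak itself: barrier() suggests passing device_id to init_process_group, and ProcessGroupNCCL warns when destroy_process_group() is never called before exit. A minimal sketch that addresses both, assuming the usual RANK/WORLD_SIZE/MASTER_ADDR/MASTER_PORT environment set by a launcher (the test harness manages this itself):

    import os
    import torch
    import torch.distributed as dist

    def main():
        rank = int(os.environ["RANK"])
        world_size = int(os.environ["WORLD_SIZE"])
        torch.cuda.set_device(rank)
        # Binding the group to an indexed device silences the barrier() warning.
        dist.init_process_group(
            "nccl",
            rank=rank,
            world_size=world_size,
            device_id=torch.device("cuda", rank),
        )
        try:
            dist.barrier()
            # ... test or training body ...
        finally:
            dist.destroy_process_group()  # explicit teardown avoids the exit-time warning

    if __name__ == "__main__":
        main()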
2025-12-04T13:38:32.1542949Z ====================== 1 failed, 17 deselected in 46.64s ======================= 2025-12-04T13:38:32.1542986Z Got exit code 1 2025-12-04T13:38:32.1543028Z Retrying single test... 2025-12-04T13:38:32.1543218Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-3701308a73cd137b.xml 2025-12-04T13:38:32.1543291Z ============================= test session starts ============================== 2025-12-04T13:38:32.1543404Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.1543447Z cachedir: .pytest_cache 2025-12-04T13:38:32.1543607Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.1543654Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.1543694Z configfile: pytest.ini 2025-12-04T13:38:32.1543860Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.1543934Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.1544193Z stepcurrent: skipping 17 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda 2025-12-04T13:38:32.1544238Z Running 1 items in this shard 2025-12-04T13:38:32.1544240Z 2025-12-04T13:38:32.1544583Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda I1204 13:24:18.101000 410703 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 410772 2025-12-04T13:38:32.1544752Z I1204 13:24:18.102000 410703 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 410773 2025-12-04T13:38:32.1544903Z I1204 13:24:18.102000 410703 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 410774 2025-12-04T13:38:32.1545055Z I1204 13:24:18.103000 410703 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 410775 2025-12-04T13:38:32.1545638Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1545678Z _warn_cpu_init() 2025-12-04T13:38:32.1546179Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:32.1546243Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1546833Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1546871Z _warn_cpu_init() 2025-12-04T13:38:32.1547365Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1547426Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1548016Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1548054Z _warn_cpu_init() 2025-12-04T13:38:32.1548541Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1548602Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1549179Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1549220Z _warn_cpu_init() 2025-12-04T13:38:32.1549518Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1549636Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1549928Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:38:32.1550008Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1550506Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1550565Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1550868Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1550962Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1551451Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1551510Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1551802Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1551883Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1552169Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1552260Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1552548Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1552621Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1553110Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1553171Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1553455Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:38:32.1553546Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1554035Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1554095Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1554385Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1554459Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1555741Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1555880Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1557142Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1557276Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1557506Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T13:38:32.1557551Z return func(*args, **kwargs) 2025-12-04T13:38:32.1557775Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1557820Z return func(*args, **kwargs) 2025-12-04T13:38:32.1559085Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1559207Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1559436Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1559478Z return func(*args, **kwargs) 2025-12-04T13:38:32.1560788Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1560924Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1561149Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1561192Z return func(*args, **kwargs) 2025-12-04T13:38:32.1561416Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T13:38:32.1561471Z return func(*args, **kwargs) 2025-12-04T13:38:32.1561692Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1561732Z return func(*args, **kwargs) 2025-12-04T13:38:32.1561955Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1561995Z return func(*args, **kwargs) 2025-12-04T13:38:32.1562214Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1562255Z return func(*args, **kwargs) 2025-12-04T13:38:32.1562549Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.1562589Z return func(*args, **kwargs) 2025-12-04T13:38:32.1562734Z [rank3]:E1204 13:24:50.291000 410775 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1562910Z [rank3]:E1204 13:24:50.291000 410775 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1563204Z [rank3]:E1204 13:24:50.291000 410775 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1563360Z [rank3]:E1204 13:24:50.291000 410775 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1563647Z [rank3]:E1204 13:24:50.291000 410775 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1563772Z [rank3]:E1204 13:24:50.291000 410775 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1564051Z [rank3]:E1204 13:24:50.291000 410775 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1564213Z [rank3]:E1204 13:24:50.291000 410775 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1564489Z [rank3]:E1204 13:24:50.291000 410775 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1564651Z [rank3]:E1204 13:24:50.291000 410775 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1564927Z [rank3]:E1204 13:24:50.291000 410775 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1565066Z [rank3]:E1204 13:24:50.291000 410775 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1565347Z 
[rank3]:E1204 13:24:50.291000 410775 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1565495Z [rank3]:E1204 13:24:50.291000 410775 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1566020Z [rank3]:E1204 13:24:50.291000 410775 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 3. CUDA driver allocated memory was 2250244096 and is now 17427333120. 2025-12-04T13:38:32.1566137Z [rank3]:E1204 13:24:50.291000 410775 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1566336Z [rank3]:E1204 13:24:50.291000 410775 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1566731Z [rank3]:E1204 13:24:50.291000 410775 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda 2025-12-04T13:38:32.1566846Z [rank3]:E1204 13:24:50.291000 410775 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1567062Z [rank3]:E1204 13:24:50.291000 410775 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1567237Z [rank3]:E1204 13:24:50.291000 410775 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.1567278Z dist init r=3, world=4 2025-12-04T13:38:32.1567416Z [rank1]:E1204 13:24:50.296000 410773 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1567579Z [rank1]:E1204 13:24:50.296000 410773 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1567867Z [rank1]:E1204 13:24:50.296000 410773 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1568024Z [rank1]:E1204 13:24:50.296000 410773 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1568312Z [rank1]:E1204 13:24:50.296000 410773 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1568435Z [rank1]:E1204 13:24:50.296000 410773 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1568726Z [rank1]:E1204 13:24:50.296000 410773 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1568885Z [rank1]:E1204 13:24:50.296000 410773 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1569161Z [rank1]:E1204 13:24:50.296000 410773 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1569308Z [rank1]:E1204 13:24:50.296000 410773 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1569687Z [rank1]:E1204 13:24:50.296000 410773 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1569827Z [rank1]:E1204 13:24:50.296000 410773 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1570106Z [rank1]:E1204 13:24:50.296000 410773 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1570269Z [rank1]:E1204 13:24:50.296000 410773 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1570783Z [rank1]:E1204 13:24:50.296000 410773 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 1. CUDA driver allocated memory was 2317352960 and is now 17494441984. 2025-12-04T13:38:32.1570899Z [rank1]:E1204 13:24:50.296000 410773 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1571098Z [rank1]:E1204 13:24:50.296000 410773 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1571490Z [rank1]:E1204 13:24:50.296000 410773 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda 2025-12-04T13:38:32.1571619Z [rank1]:E1204 13:24:50.296000 410773 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1571833Z [rank1]:E1204 13:24:50.296000 410773 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1571998Z [rank1]:E1204 13:24:50.296000 410773 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.1572038Z dist init r=1, world=4 2025-12-04T13:38:32.1572177Z [rank2]:E1204 13:24:50.343000 410774 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1572336Z [rank2]:E1204 13:24:50.343000 410774 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1572625Z [rank2]:E1204 13:24:50.343000 410774 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T13:38:32.1572779Z [rank2]:E1204 13:24:50.343000 410774 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1573078Z [rank2]:E1204 13:24:50.343000 410774 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1573216Z [rank2]:E1204 13:24:50.343000 410774 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1573494Z [rank2]:E1204 13:24:50.343000 410774 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1573643Z [rank2]:E1204 13:24:50.343000 410774 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1573917Z [rank2]:E1204 13:24:50.343000 410774 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1574066Z [rank2]:E1204 13:24:50.343000 410774 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1574344Z [rank2]:E1204 13:24:50.343000 410774 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1574491Z [rank2]:E1204 13:24:50.343000 410774 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1574771Z [rank2]:E1204 13:24:50.343000 410774 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1574918Z [rank2]:E1204 13:24:50.343000 410774 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1575429Z [rank2]:E1204 13:24:50.343000 410774 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 2. CUDA driver allocated memory was 2300575744 and is now 17477664768. 
2025-12-04T13:38:32.1575545Z [rank2]:E1204 13:24:50.343000 410774 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1575762Z [rank2]:E1204 13:24:50.343000 410774 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1576153Z [rank2]:E1204 13:24:50.343000 410774 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda 2025-12-04T13:38:32.1576265Z [rank2]:E1204 13:24:50.343000 410774 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1576480Z [rank2]:E1204 13:24:50.343000 410774 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1576643Z [rank2]:E1204 13:24:50.343000 410774 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.1576684Z dist init r=2, world=4 2025-12-04T13:38:32.1576821Z [rank0]:E1204 13:24:50.348000 410772 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1576982Z [rank0]:E1204 13:24:50.348000 410772 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1577278Z [rank0]:E1204 13:24:50.348000 410772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1577444Z [rank0]:E1204 13:24:50.348000 410772 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1577733Z [rank0]:E1204 13:24:50.348000 410772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1577858Z [rank0]:E1204 13:24:50.348000 410772 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1578135Z [rank0]:E1204 13:24:50.348000 410772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1578283Z [rank0]:E1204 13:24:50.348000 410772 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1578560Z [rank0]:E1204 13:24:50.348000 410772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1578718Z [rank0]:E1204 13:24:50.348000 410772 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1578997Z [rank0]:E1204 13:24:50.348000 410772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1579135Z [rank0]:E1204 13:24:50.348000 410772 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T13:38:32.1579413Z [rank0]:E1204 13:24:50.348000 410772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1579563Z [rank0]:E1204 13:24:50.348000 410772 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1580120Z [rank0]:E1204 13:24:50.348000 410772 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 0. CUDA driver allocated memory was 2453667840 and is now 17630756864. 2025-12-04T13:38:32.1580239Z [rank0]:E1204 13:24:50.348000 410772 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1580434Z [rank0]:E1204 13:24:50.348000 410772 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1580822Z [rank0]:E1204 13:24:50.348000 410772 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda 2025-12-04T13:38:32.1580939Z [rank0]:E1204 13:24:50.348000 410772 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1581150Z [rank0]:E1204 13:24:50.348000 410772 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1581314Z [rank0]:E1204 13:24:50.348000 410772 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.1581352Z dist init r=0, world=4 2025-12-04T13:38:32.1581701Z [rank3]:[W1204 13:24:50.447180058 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1582042Z [rank1]:[W1204 13:24:50.466136064 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1582370Z [rank0]:[W1204 13:24:50.598576057 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1582701Z [rank2]:[W1204 13:24:50.619712354 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1582741Z FAILED [46.4716s] [100%] 2025-12-04T13:38:32.1582744Z 2025-12-04T13:38:32.1582804Z =================================== FAILURES =================================== 2025-12-04T13:38:32.1582943Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda _ 2025-12-04T13:38:32.1582992Z Traceback (most recent call last): 2025-12-04T13:38:32.1583155Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.1583201Z self._join_processes(fn) 2025-12-04T13:38:32.1583375Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.1583431Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.1583611Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.1583656Z raise RuntimeError(error) 2025-12-04T13:38:32.1583736Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.1583784Z Traceback (most recent call last): 2025-12-04T13:38:32.1583946Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1583990Z getattr(self, test_name)() 2025-12-04T13:38:32.1584161Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1584196Z fn() 2025-12-04T13:38:32.1584352Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1584392Z method(*args, **kwargs) 2025-12-04T13:38:32.1584547Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1584587Z method(*args, **kwargs) 2025-12-04T13:38:32.1584742Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1584780Z with policy(): 2025-12-04T13:38:32.1584936Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1584977Z raise RuntimeError(msg) 2025-12-04T13:38:32.1585366Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 3. CUDA driver allocated memory was 2250244096 and is now 17427333120. 
2025-12-04T13:38:32.1585368Z 2025-12-04T13:38:32.1585454Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1585719Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda 2025-12-04T13:38:32.1585732Z 2025-12-04T13:38:32.1585820Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1585824Z 2025-12-04T13:38:32.1585826Z 2025-12-04T13:38:32.1585901Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.1585991Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.1586225Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-3701308a73cd137b.xml - 2025-12-04T13:38:32.1586288Z =========================== short test summary info ============================ 2025-12-04T13:38:32.1586564Z FAILED [46.4716s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.1586612Z Traceback (most recent call last): 2025-12-04T13:38:32.1586788Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1586832Z getattr(self, test_name)() 2025-12-04T13:38:32.1586994Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1587031Z fn() 2025-12-04T13:38:32.1587186Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1587228Z method(*args, **kwargs) 2025-12-04T13:38:32.1587381Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1587426Z method(*args, **kwargs) 2025-12-04T13:38:32.1587578Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1587618Z with policy(): 2025-12-04T13:38:32.1587774Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1587814Z raise RuntimeError(msg) 2025-12-04T13:38:32.1588212Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 3. CUDA driver allocated memory was 2250244096 and is now 17427333120. 2025-12-04T13:38:32.1588214Z 2025-12-04T13:38:32.1588288Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1588550Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda 2025-12-04T13:38:32.1588553Z 2025-12-04T13:38:32.1588640Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1588706Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
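The failure above is the PYTORCH_TEST_CUDA_MEM_LEAK_CHECK assertion, and the warnings leading up to it (a bare `device_id` of "cuda" passed to FSDP, `barrier()` falling back to the current device, and `destroy_process_group()` never being called before exit) all concern process-group setup and teardown. Below is a minimal, illustrative sketch of how a standalone FSDP script can avoid those warnings; it is not the test-suite code, and it assumes a torchrun-style launch that sets LOCAL_RANK with one GPU per rank.

import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def main() -> None:
    # Assumption: launched via torchrun, which sets LOCAL_RANK (and rendezvous env vars).
    local_rank = int(os.environ["LOCAL_RANK"])
    device = torch.device("cuda", local_rank)
    torch.cuda.set_device(device)                       # make the current device explicit before FSDP init
    dist.init_process_group("nccl", device_id=device)   # an indexed device also silences the barrier() warning

    # Pass an indexed device to FSDP instead of the bare "cuda" string seen in the warnings.
    model = FSDP(nn.Linear(8, 8), device_id=device)
    loss = model(torch.randn(4, 8, device=device)).sum()
    loss.backward()

    dist.barrier()
    dist.destroy_process_group()                        # avoids the ProcessGroupNCCL shutdown warning

if __name__ == "__main__":
    main()

Launched with, for example, torchrun --nproc-per-node=4 (the script name is up to the user), this setup addresses the warnings; it does not by itself explain the allocator growth that the leak check reports.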
2025-12-04T13:38:32.1588768Z ====================== 1 failed, 32 deselected in 46.63s ======================= 2025-12-04T13:38:32.1588807Z Got exit code 1 2025-12-04T13:38:32.1588847Z Retrying single test... 2025-12-04T13:38:32.1589039Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-2e31e92306f47555.xml 2025-12-04T13:38:32.1589097Z ============================= test session starts ============================== 2025-12-04T13:38:32.1589213Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.1589265Z cachedir: .pytest_cache 2025-12-04T13:38:32.1589426Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.1589485Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.1589526Z configfile: pytest.ini 2025-12-04T13:38:32.1589726Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.1589801Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.1590059Z stepcurrent: skipping 17 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda 2025-12-04T13:38:32.1590103Z Running 1 items in this shard 2025-12-04T13:38:32.1590105Z 2025-12-04T13:38:32.1590444Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda I1204 13:25:07.172000 411969 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 412038 2025-12-04T13:38:32.1590601Z I1204 13:25:07.173000 411969 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 412039 2025-12-04T13:38:32.1590784Z I1204 13:25:07.174000 411969 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 412040 2025-12-04T13:38:32.1590933Z I1204 13:25:07.174000 411969 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 412041 2025-12-04T13:38:32.1591514Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1591554Z _warn_cpu_init() 2025-12-04T13:38:32.1592045Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:32.1592109Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1592692Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1592733Z _warn_cpu_init() 2025-12-04T13:38:32.1593223Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1593284Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1593870Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1593920Z _warn_cpu_init() 2025-12-04T13:38:32.1594413Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1594474Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1595049Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1595099Z _warn_cpu_init() 2025-12-04T13:38:32.1595389Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1595473Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1595966Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:32.1596024Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1596317Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1596398Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1596919Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1596977Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1597264Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1597346Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1597635Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1597714Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1598000Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1598077Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1598375Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1598462Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1598954Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1599014Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1599302Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1599381Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:38:32.1599920Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1599994Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1600282Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1600356Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.1601647Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1601776Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1602007Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1602056Z return func(*args, **kwargs) 2025-12-04T13:38:32.1603325Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1603465Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1604730Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. 
This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1604862Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1605093Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1605135Z return func(*args, **kwargs) 2025-12-04T13:38:32.1605362Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1605404Z return func(*args, **kwargs) 2025-12-04T13:38:32.1606677Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1606800Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1607026Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1607069Z return func(*args, **kwargs) 2025-12-04T13:38:32.1607292Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1607336Z return func(*args, **kwargs) 2025-12-04T13:38:32.1607566Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1607610Z return func(*args, **kwargs) 2025-12-04T13:38:32.1607849Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T13:38:32.1607892Z return func(*args, **kwargs) 2025-12-04T13:38:32.1608113Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1608155Z return func(*args, **kwargs) 2025-12-04T13:38:32.1608452Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.1608492Z return func(*args, **kwargs) 2025-12-04T13:38:32.1608639Z [rank1]:E1204 13:25:39.390000 412039 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1608802Z [rank1]:E1204 13:25:39.390000 412039 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1609106Z [rank1]:E1204 13:25:39.390000 412039 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1609261Z [rank1]:E1204 13:25:39.390000 412039 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1609549Z [rank1]:E1204 13:25:39.390000 412039 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1609708Z [rank1]:E1204 13:25:39.390000 412039 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1609988Z [rank1]:E1204 13:25:39.390000 412039 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1610142Z [rank1]:E1204 13:25:39.390000 412039 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1610434Z [rank1]:E1204 13:25:39.390000 412039 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1610584Z [rank1]:E1204 13:25:39.390000 412039 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1610860Z [rank1]:E1204 13:25:39.390000 412039 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1611002Z [rank1]:E1204 13:25:39.390000 412039 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1616225Z [rank1]:E1204 13:25:39.390000 412039 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1616383Z [rank1]:E1204 13:25:39.390000 412039 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1616928Z [rank1]:E1204 13:25:39.390000 412039 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver 
API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 1. CUDA driver allocated memory was 2317352960 and is now 17494441984. 2025-12-04T13:38:32.1617063Z [rank1]:E1204 13:25:39.390000 412039 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1617262Z [rank1]:E1204 13:25:39.390000 412039 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1617658Z [rank1]:E1204 13:25:39.390000 412039 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda 2025-12-04T13:38:32.1617776Z [rank1]:E1204 13:25:39.390000 412039 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1617990Z [rank1]:E1204 13:25:39.390000 412039 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1618160Z [rank1]:E1204 13:25:39.390000 412039 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.1618215Z dist init r=1, world=4 2025-12-04T13:38:32.1618358Z [rank2]:E1204 13:25:39.392000 412040 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1618519Z [rank2]:E1204 13:25:39.392000 412040 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1618815Z [rank2]:E1204 13:25:39.392000 412040 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1618969Z [rank2]:E1204 13:25:39.392000 412040 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1619255Z [rank2]:E1204 13:25:39.392000 412040 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1619385Z [rank2]:E1204 13:25:39.392000 412040 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1619715Z [rank2]:E1204 13:25:39.392000 412040 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1619869Z [rank2]:E1204 13:25:39.392000 412040 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1620146Z [rank2]:E1204 13:25:39.392000 412040 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1620294Z [rank2]:E1204 13:25:39.392000 412040 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1620573Z [rank2]:E1204 13:25:39.392000 412040 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1620712Z [rank2]:E1204 13:25:39.392000 412040 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1620993Z [rank2]:E1204 13:25:39.392000 412040 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1621154Z [rank2]:E1204 13:25:39.392000 412040 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1621668Z [rank2]:E1204 13:25:39.392000 412040 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 2. CUDA driver allocated memory was 2300575744 and is now 17477664768. 2025-12-04T13:38:32.1621797Z [rank2]:E1204 13:25:39.392000 412040 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1621996Z [rank2]:E1204 13:25:39.392000 412040 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1622382Z [rank2]:E1204 13:25:39.392000 412040 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda 2025-12-04T13:38:32.1622498Z [rank2]:E1204 13:25:39.392000 412040 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1622729Z [rank2]:E1204 13:25:39.392000 412040 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1622893Z [rank2]:E1204 13:25:39.392000 412040 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.1622940Z dist init r=2, world=4 2025-12-04T13:38:32.1623077Z [rank0]:E1204 13:25:39.424000 412038 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1623241Z [rank0]:E1204 13:25:39.424000 412038 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1623529Z [rank0]:E1204 13:25:39.424000 412038 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1623688Z [rank0]:E1204 13:25:39.424000 412038 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1623973Z [rank0]:E1204 13:25:39.424000 412038 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1624111Z [rank0]:E1204 13:25:39.424000 412038 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 
2025-12-04T13:38:32.1624392Z [rank0]:E1204 13:25:39.424000 412038 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1624538Z [rank0]:E1204 13:25:39.424000 412038 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1624819Z [rank0]:E1204 13:25:39.424000 412038 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1624966Z [rank0]:E1204 13:25:39.424000 412038 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1625245Z [rank0]:E1204 13:25:39.424000 412038 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1625392Z [rank0]:E1204 13:25:39.424000 412038 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1625674Z [rank0]:E1204 13:25:39.424000 412038 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1625837Z [rank0]:E1204 13:25:39.424000 412038 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1626344Z [rank0]:E1204 13:25:39.424000 412038 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 0. CUDA driver allocated memory was 2453667840 and is now 17630756864. 
2025-12-04T13:38:32.1626460Z [rank0]:E1204 13:25:39.424000 412038 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1626656Z [rank0]:E1204 13:25:39.424000 412038 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1627045Z [rank0]:E1204 13:25:39.424000 412038 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda 2025-12-04T13:38:32.1627173Z [rank0]:E1204 13:25:39.424000 412038 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1627386Z [rank0]:E1204 13:25:39.424000 412038 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1627553Z [rank0]:E1204 13:25:39.424000 412038 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.1627592Z dist init r=0, world=4 2025-12-04T13:38:32.1627733Z [rank3]:E1204 13:25:39.436000 412041 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1627895Z [rank3]:E1204 13:25:39.436000 412041 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1628183Z [rank3]:E1204 13:25:39.436000 412041 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1628347Z [rank3]:E1204 13:25:39.436000 412041 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1628635Z [rank3]:E1204 13:25:39.436000 412041 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1628762Z [rank3]:E1204 13:25:39.436000 412041 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1629042Z [rank3]:E1204 13:25:39.436000 412041 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1629192Z [rank3]:E1204 13:25:39.436000 412041 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1629471Z [rank3]:E1204 13:25:39.436000 412041 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1629653Z [rank3]:E1204 13:25:39.436000 412041 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1629965Z [rank3]:E1204 13:25:39.436000 412041 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1630122Z [rank3]:E1204 13:25:39.436000 412041 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T13:38:32.1630403Z [rank3]:E1204 13:25:39.436000 412041 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1630554Z [rank3]:E1204 13:25:39.436000 412041 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1631063Z [rank3]:E1204 13:25:39.436000 412041 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 3. CUDA driver allocated memory was 2250244096 and is now 17427333120. 2025-12-04T13:38:32.1631191Z [rank3]:E1204 13:25:39.436000 412041 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1631389Z [rank3]:E1204 13:25:39.436000 412041 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1631774Z [rank3]:E1204 13:25:39.436000 412041 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda 2025-12-04T13:38:32.1631889Z [rank3]:E1204 13:25:39.436000 412041 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1632103Z [rank3]:E1204 13:25:39.436000 412041 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1632267Z [rank3]:E1204 13:25:39.436000 412041 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.1632310Z dist init r=3, world=4 2025-12-04T13:38:32.1632660Z [rank2]:[W1204 13:25:39.556824788 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1632992Z [rank1]:[W1204 13:25:39.605396743 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1633320Z [rank3]:[W1204 13:25:39.709245835 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1633650Z [rank0]:[W1204 13:25:39.747508723 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1633696Z FAILED [46.4710s] [100%] 2025-12-04T13:38:32.1633699Z 2025-12-04T13:38:32.1633757Z =================================== FAILURES =================================== 2025-12-04T13:38:32.1633889Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda _ 2025-12-04T13:38:32.1633936Z Traceback (most recent call last): 2025-12-04T13:38:32.1634115Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.1634160Z self._join_processes(fn) 2025-12-04T13:38:32.1634347Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.1634403Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.1634586Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.1634630Z raise RuntimeError(error) 2025-12-04T13:38:32.1634714Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.1634761Z Traceback (most recent call last): 2025-12-04T13:38:32.1634926Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1634970Z getattr(self, test_name)() 2025-12-04T13:38:32.1635130Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1635167Z fn() 2025-12-04T13:38:32.1635332Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1635373Z method(*args, **kwargs) 2025-12-04T13:38:32.1635528Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1635568Z method(*args, **kwargs) 2025-12-04T13:38:32.1635723Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1635764Z with policy(): 2025-12-04T13:38:32.1635917Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1635961Z raise RuntimeError(msg) 2025-12-04T13:38:32.1636347Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 2. CUDA driver allocated memory was 2300575744 and is now 17477664768. 
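The leak-check failure above is raised by the context manager at common_utils.py:2705, which records CUDA memory counters before the test body and compares them in __exit__. The following is a rough sketch of that idea using only the public torch.cuda memory APIs; it is an approximation for illustration, not the actual CudaMemoryLeakCheck implementation (which also consults the driver-level counters quoted in the error message).

import gc
import torch

class SimpleCudaLeakCheck:
    """Approximate leak check: compare per-device caching-allocator usage around a block."""

    def __enter__(self):
        gc.collect()
        torch.cuda.synchronize()
        self.before = [torch.cuda.memory_allocated(d) for d in range(torch.cuda.device_count())]
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            return False  # never mask an exception raised by the test body itself
        gc.collect()
        torch.cuda.synchronize()
        after = [torch.cuda.memory_allocated(d) for d in range(torch.cuda.device_count())]
        for dev, (b, a) in enumerate(zip(self.before, after)):
            if a > b:
                raise RuntimeError(
                    f"possible leak on device {dev}: allocated memory was {b} and is now {a}"
                )
        return False

# usage sketch:
# with SimpleCudaLeakCheck():
#     run_one_test_case()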
2025-12-04T13:38:32.1636352Z 2025-12-04T13:38:32.1636430Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1636703Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda 2025-12-04T13:38:32.1636706Z 2025-12-04T13:38:32.1636798Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1636800Z 2025-12-04T13:38:32.1636802Z 2025-12-04T13:38:32.1636882Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.1636973Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.1637212Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-2e31e92306f47555.xml - 2025-12-04T13:38:32.1637275Z =========================== short test summary info ============================ 2025-12-04T13:38:32.1637557Z FAILED [46.4710s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.1637603Z Traceback (most recent call last): 2025-12-04T13:38:32.1637773Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1637815Z getattr(self, test_name)() 2025-12-04T13:38:32.1637990Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1638025Z fn() 2025-12-04T13:38:32.1638180Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1638232Z method(*args, **kwargs) 2025-12-04T13:38:32.1638386Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1638426Z method(*args, **kwargs) 2025-12-04T13:38:32.1638580Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1638618Z with policy(): 2025-12-04T13:38:32.1638776Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1638816Z raise RuntimeError(msg) 2025-12-04T13:38:32.1639210Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 2. CUDA driver allocated memory was 2300575744 and is now 17477664768. 2025-12-04T13:38:32.1639226Z 2025-12-04T13:38:32.1639307Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1639606Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda 2025-12-04T13:38:32.1639608Z 2025-12-04T13:38:32.1639700Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1639764Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
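The ProcessGroupNCCL warnings earlier in this failure note that destroy_process_group() was not called before the worker processes exited. A minimal sketch of the recommended setup and teardown, assuming a launcher such as torchrun that exports LOCAL_RANK and the rendezvous variables; this is illustrative, not the test harness's own worker code.

import os
import torch
import torch.distributed as dist

def main() -> None:
    # torchrun exports RANK, LOCAL_RANK, WORLD_SIZE, MASTER_ADDR and MASTER_PORT,
    # which init_process_group picks up with the default env:// rendezvous.
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    dist.init_process_group(backend="nccl")
    try:
        t = torch.ones(1, device="cuda")
        dist.all_reduce(t)  # stand-in for the real collective work
    finally:
        # Explicit teardown avoids the "destroy_process_group() was not called
        # before program exit" warning seen above.
        dist.destroy_process_group()

if __name__ == "__main__":
    main()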
2025-12-04T13:38:32.1639831Z ====================== 1 failed, 32 deselected in 46.62s ======================= 2025-12-04T13:38:32.1639869Z Got exit code 1 2025-12-04T13:38:32.1640082Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda 2025-12-04T13:38:32.1640213Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:32.1640404Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-9797a90b2134b2b2.xml 2025-12-04T13:38:32.1640462Z ============================= test session starts ============================== 2025-12-04T13:38:32.1640595Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.1640637Z cachedir: .pytest_cache 2025-12-04T13:38:32.1640800Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.1640849Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.1640891Z configfile: pytest.ini 2025-12-04T13:38:32.1641058Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.1641134Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:38:32.1641189Z stepcurrent: skipping 18 already run items. 2025-12-04T13:38:32.1641232Z Running 15 items in this shard 2025-12-04T13:38:32.1641234Z 2025-12-04T13:38:32.1641572Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda I1204 13:25:56.344000 413235 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 413304 2025-12-04T13:38:32.1641727Z I1204 13:25:56.344000 413235 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 413305 2025-12-04T13:38:32.1641897Z I1204 13:25:56.345000 413235 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 413306 2025-12-04T13:38:32.1642047Z I1204 13:25:56.346000 413235 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 413307 2025-12-04T13:38:32.1642650Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1642694Z _warn_cpu_init() 2025-12-04T13:38:32.1642996Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:38:32.1643036Z _init_core_state( 2025-12-04T13:38:32.1643531Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1643611Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1644191Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1644231Z _warn_cpu_init() 2025-12-04T13:38:32.1644529Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:38:32.1644567Z _init_core_state( 2025-12-04T13:38:32.1645080Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1645140Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1645713Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1645754Z _warn_cpu_init() 2025-12-04T13:38:32.1646046Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:38:32.1646086Z _init_core_state( 2025-12-04T13:38:32.1646588Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1646650Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1647231Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.1647271Z _warn_cpu_init() 2025-12-04T13:38:32.1647759Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1647819Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1648318Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1648376Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1648673Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:38:32.1648711Z _init_core_state( 2025-12-04T13:38:32.1649208Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1649270Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1649818Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1649879Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1651160Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T13:38:32.1651287Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1652558Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1652684Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1653956Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1654076Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1655336Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T13:38:32.1655460Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1655690Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1655736Z return func(*args, **kwargs) 2025-12-04T13:38:32.1655974Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1656018Z return func(*args, **kwargs) 2025-12-04T13:38:32.1656245Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1656299Z return func(*args, **kwargs) 2025-12-04T13:38:32.1656525Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1656566Z return func(*args, **kwargs) 2025-12-04T13:38:32.1656790Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1656830Z return func(*args, **kwargs) 2025-12-04T13:38:32.1657053Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1657093Z return func(*args, **kwargs) 2025-12-04T13:38:32.1657316Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1657367Z return func(*args, **kwargs) 2025-12-04T13:38:32.1657589Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1657628Z return func(*args, **kwargs) 2025-12-04T13:38:32.1657924Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T13:38:32.1657964Z return func(*args, **kwargs) 2025-12-04T13:38:32.1658112Z [rank1]:E1204 13:26:28.721000 413305 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1658278Z [rank1]:E1204 13:26:28.721000 413305 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1658575Z [rank1]:E1204 13:26:28.721000 413305 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1658743Z [rank1]:E1204 13:26:28.721000 413305 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1659030Z [rank1]:E1204 13:26:28.721000 413305 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1659158Z [rank1]:E1204 13:26:28.721000 413305 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1659435Z [rank1]:E1204 13:26:28.721000 413305 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1659629Z [rank1]:E1204 13:26:28.721000 413305 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1659911Z [rank1]:E1204 13:26:28.721000 413305 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1660060Z [rank1]:E1204 13:26:28.721000 413305 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1660352Z [rank1]:E1204 13:26:28.721000 413305 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1660503Z [rank1]:E1204 13:26:28.721000 413305 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1660788Z [rank1]:E1204 13:26:28.721000 413305 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1660939Z [rank1]:E1204 13:26:28.721000 413305 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1661452Z [rank1]:E1204 13:26:28.721000 413305 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 1. CUDA driver allocated memory was 2317352960 and is now 17502830592. 
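The c10d_logger warning a few entries above ("barrier(): using the device under current context") says the message can be muted by passing `device_id` to `init_process_group`. A minimal sketch of that, assuming a PyTorch build (like the one in this log) whose init_process_group accepts a torch.device via device_id and a launcher that sets LOCAL_RANK:

import os
import torch
import torch.distributed as dist

local_rank = int(os.environ.get("LOCAL_RANK", "0"))
torch.cuda.set_device(local_rank)

# Binding the process group to an explicit device lets barrier() pick the
# right GPU instead of inferring it from the current context.
dist.init_process_group(
    backend="nccl",
    device_id=torch.device(f"cuda:{local_rank}"),
)

dist.barrier()
dist.destroy_process_group()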
2025-12-04T13:38:32.1661570Z [rank1]:E1204 13:26:28.721000 413305 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1661788Z [rank1]:E1204 13:26:28.721000 413305 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1662177Z [rank1]:E1204 13:26:28.721000 413305 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda 2025-12-04T13:38:32.1662290Z [rank1]:E1204 13:26:28.721000 413305 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1662505Z [rank1]:E1204 13:26:28.721000 413305 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1662670Z [rank1]:E1204 13:26:28.721000 413305 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.1662714Z dist init r=1, world=4 2025-12-04T13:38:32.1662851Z [rank2]:E1204 13:26:28.731000 413306 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1663013Z [rank2]:E1204 13:26:28.731000 413306 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1663318Z [rank2]:E1204 13:26:28.731000 413306 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1663475Z [rank2]:E1204 13:26:28.731000 413306 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1663764Z [rank2]:E1204 13:26:28.731000 413306 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1663889Z [rank2]:E1204 13:26:28.731000 413306 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1664169Z [rank2]:E1204 13:26:28.731000 413306 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1664316Z [rank2]:E1204 13:26:28.731000 413306 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1664603Z [rank2]:E1204 13:26:28.731000 413306 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1664755Z [rank2]:E1204 13:26:28.731000 413306 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1665043Z [rank2]:E1204 13:26:28.731000 413306 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1665182Z [rank2]:E1204 13:26:28.731000 413306 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T13:38:32.1665462Z [rank2]:E1204 13:26:28.731000 413306 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1665614Z [rank2]:E1204 13:26:28.731000 413306 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1666124Z [rank2]:E1204 13:26:28.731000 413306 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17486053376. 2025-12-04T13:38:32.1666253Z [rank2]:E1204 13:26:28.731000 413306 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1666451Z [rank2]:E1204 13:26:28.731000 413306 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1666839Z [rank2]:E1204 13:26:28.731000 413306 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda 2025-12-04T13:38:32.1666955Z [rank2]:E1204 13:26:28.731000 413306 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1667167Z [rank2]:E1204 13:26:28.731000 413306 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1667336Z [rank2]:E1204 13:26:28.731000 413306 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.1667385Z dist init r=2, world=4 2025-12-04T13:38:32.1667526Z [rank0]:E1204 13:26:28.744000 413304 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1667686Z [rank0]:E1204 13:26:28.744000 413304 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1667978Z [rank0]:E1204 13:26:28.744000 413304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1668136Z [rank0]:E1204 13:26:28.744000 413304 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1668424Z [rank0]:E1204 13:26:28.744000 413304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1668551Z [rank0]:E1204 13:26:28.744000 413304 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1668837Z [rank0]:E1204 13:26:28.744000 413304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1668988Z [rank0]:E1204 13:26:28.744000 413304 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1669275Z [rank0]:E1204 13:26:28.744000 413304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1669423Z [rank0]:E1204 13:26:28.744000 413304 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1669747Z [rank0]:E1204 13:26:28.744000 413304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1669884Z [rank0]:E1204 13:26:28.744000 413304 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1670163Z [rank0]:E1204 13:26:28.744000 413304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1670327Z [rank0]:E1204 13:26:28.744000 413304 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1670838Z [rank0]:E1204 13:26:28.744000 413304 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17639145472. 2025-12-04T13:38:32.1670953Z [rank0]:E1204 13:26:28.744000 413304 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1671148Z [rank0]:E1204 13:26:28.744000 413304 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1671531Z [rank0]:E1204 13:26:28.744000 413304 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda 2025-12-04T13:38:32.1671644Z [rank0]:E1204 13:26:28.744000 413304 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1671870Z [rank0]:E1204 13:26:28.744000 413304 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1672034Z [rank0]:E1204 13:26:28.744000 413304 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.1672078Z dist init r=0, world=4 2025-12-04T13:38:32.1672216Z [rank3]:E1204 13:26:28.748000 413307 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1672379Z [rank3]:E1204 13:26:28.748000 413307 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1672668Z [rank3]:E1204 13:26:28.748000 413307 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T13:38:32.1672822Z [rank3]:E1204 13:26:28.748000 413307 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1673125Z [rank3]:E1204 13:26:28.748000 413307 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1673249Z [rank3]:E1204 13:26:28.748000 413307 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1673541Z [rank3]:E1204 13:26:28.748000 413307 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1673689Z [rank3]:E1204 13:26:28.748000 413307 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1673967Z [rank3]:E1204 13:26:28.748000 413307 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1674115Z [rank3]:E1204 13:26:28.748000 413307 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1674391Z [rank3]:E1204 13:26:28.748000 413307 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1674540Z [rank3]:E1204 13:26:28.748000 413307 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1674819Z [rank3]:E1204 13:26:28.748000 413307 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1674970Z [rank3]:E1204 13:26:28.748000 413307 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1675480Z [rank3]:E1204 13:26:28.748000 413307 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17435721728. 
2025-12-04T13:38:32.1675597Z [rank3]:E1204 13:26:28.748000 413307 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1675795Z [rank3]:E1204 13:26:28.748000 413307 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1676189Z [rank3]:E1204 13:26:28.748000 413307 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda 2025-12-04T13:38:32.1676304Z [rank3]:E1204 13:26:28.748000 413307 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1676515Z [rank3]:E1204 13:26:28.748000 413307 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1676681Z [rank3]:E1204 13:26:28.748000 413307 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.1676720Z dist init r=3, world=4 2025-12-04T13:38:32.1677059Z [rank1]:[W1204 13:26:28.883919614 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1677388Z [rank2]:[W1204 13:26:28.925505089 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1677726Z [rank0]:[W1204 13:26:29.004387514 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1678074Z [rank3]:[W1204 13:26:29.019036206 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1678115Z FAILED [46.7694s] [ 6%] 2025-12-04T13:38:32.1678117Z 2025-12-04T13:38:32.1678177Z =================================== FAILURES =================================== 2025-12-04T13:38:32.1678300Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda _ 2025-12-04T13:38:32.1678349Z Traceback (most recent call last): 2025-12-04T13:38:32.1678515Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.1678561Z self._join_processes(fn) 2025-12-04T13:38:32.1678735Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.1678803Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.1678984Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.1679028Z raise RuntimeError(error) 2025-12-04T13:38:32.1679112Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.1679157Z Traceback (most recent call last): 2025-12-04T13:38:32.1679321Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1679363Z getattr(self, test_name)() 2025-12-04T13:38:32.1679526Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1679562Z fn() 2025-12-04T13:38:32.1679755Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1679798Z method(*args, **kwargs) 2025-12-04T13:38:32.1679953Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1679993Z method(*args, **kwargs) 2025-12-04T13:38:32.1680164Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1680203Z with policy(): 2025-12-04T13:38:32.1680358Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1680401Z raise RuntimeError(msg) 2025-12-04T13:38:32.1680787Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 1. CUDA driver allocated memory was 2317352960 and is now 17502830592. 
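The traceback above shows the harness flow: each rank runs the test in its own spawned process, _join_processes waits for them, and _check_return_codes turns a non-zero exit (here code 10 from the leak check) into the RuntimeError reported on the main process. A simplified sketch of that pattern with torch.multiprocessing; the names worker and run_parallel are invented for illustration and this is not the common_distributed.py implementation.

import torch.multiprocessing as mp

def worker(rank: int, world_size: int) -> None:
    # Placeholder for the per-rank test body: a real worker would call
    # init_process_group, run one test method, and exit non-zero on failure.
    pass

def run_parallel(world_size: int = 4) -> None:
    ctx = mp.get_context("spawn")
    procs = [ctx.Process(target=worker, args=(r, world_size)) for r in range(world_size)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    for rank, p in enumerate(procs):
        if p.exitcode != 0:
            # Mirrors the role of _check_return_codes in the traceback above.
            raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")

if __name__ == "__main__":
    run_parallel()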
2025-12-04T13:38:32.1680791Z 2025-12-04T13:38:32.1680868Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1681128Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda 2025-12-04T13:38:32.1681132Z 2025-12-04T13:38:32.1681223Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1681225Z 2025-12-04T13:38:32.1681285Z Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.1681332Z Traceback (most recent call last): 2025-12-04T13:38:32.1681510Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1681553Z getattr(self, test_name)() 2025-12-04T13:38:32.1681730Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1681765Z fn() 2025-12-04T13:38:32.1681919Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1681958Z method(*args, **kwargs) 2025-12-04T13:38:32.1682111Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1682150Z method(*args, **kwargs) 2025-12-04T13:38:32.1682303Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1682340Z with policy(): 2025-12-04T13:38:32.1682496Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1682537Z raise RuntimeError(msg) 2025-12-04T13:38:32.1682921Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17486053376. 2025-12-04T13:38:32.1682937Z 2025-12-04T13:38:32.1683011Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1683270Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda 2025-12-04T13:38:32.1683272Z 2025-12-04T13:38:32.1683362Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1683365Z 2025-12-04T13:38:32.1683366Z 2025-12-04T13:38:32.1683441Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.1683535Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:38:32.1683770Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-9797a90b2134b2b2.xml - 2025-12-04T13:38:32.1683832Z =========================== short test summary info ============================ 2025-12-04T13:38:32.1684117Z FAILED [46.7694s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.1684165Z Traceback (most recent call last): 2025-12-04T13:38:32.1684329Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1684373Z getattr(self, test_name)() 2025-12-04T13:38:32.1684533Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1684571Z fn() 2025-12-04T13:38:32.1684723Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1684765Z method(*args, **kwargs) 2025-12-04T13:38:32.1684915Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1684957Z method(*args, **kwargs) 2025-12-04T13:38:32.1685110Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1685146Z with policy(): 2025-12-04T13:38:32.1685309Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1685350Z raise RuntimeError(msg) 2025-12-04T13:38:32.1685729Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 1. CUDA driver allocated memory was 2317352960 and is now 17502830592. 
2025-12-04T13:38:32.1685745Z 2025-12-04T13:38:32.1685818Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1686074Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda 2025-12-04T13:38:32.1686076Z 2025-12-04T13:38:32.1686161Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1686164Z 2025-12-04T13:38:32.1686225Z Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.1686269Z Traceback (most recent call last): 2025-12-04T13:38:32.1686433Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1686488Z getattr(self, test_name)() 2025-12-04T13:38:32.1686646Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1686682Z fn() 2025-12-04T13:38:32.1686832Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1686874Z method(*args, **kwargs) 2025-12-04T13:38:32.1687026Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1687069Z method(*args, **kwargs) 2025-12-04T13:38:32.1687221Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1687260Z with policy(): 2025-12-04T13:38:32.1687411Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1687455Z raise RuntimeError(msg) 2025-12-04T13:38:32.1687831Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17486053376. 2025-12-04T13:38:32.1687833Z 2025-12-04T13:38:32.1687919Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1688175Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda 2025-12-04T13:38:32.1688180Z 2025-12-04T13:38:32.1688267Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1688332Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.1688397Z ====================== 1 failed, 18 deselected in 46.93s ======================= 2025-12-04T13:38:32.1688436Z Got exit code 1 2025-12-04T13:38:32.1688475Z Retrying single test... 
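The "Retrying single test..." and "FAILED CONSISTENTLY" lines reflect the runner's behavior: after a shard fails, the failing node id is rerun in isolation (the stepcurrent lines below select only that test), and a test that also fails the rerun is marked as consistently failing while the rest of the shard continues because continue-through-error is set. A rough sketch of that control flow; the command and function below are hypothetical and are not the actual run_test.py logic.

import subprocess

def rerun_in_isolation(test_id: str, retries: int = 1) -> bool:
    """Hypothetical helper: rerun a failing pytest node id by itself."""
    for _ in range(retries):
        print("Retrying single test...", test_id)
        result = subprocess.run(["python", "-m", "pytest", "-x", test_id])
        if result.returncode == 0:
            return True  # flaky: passed on retry
    print("FAILED CONSISTENTLY:", test_id)
    return False  # consistently failing; the caller continues with the remaining tests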
2025-12-04T13:38:32.1688668Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-abfa87f8a65806d1.xml 2025-12-04T13:38:32.1688726Z ============================= test session starts ============================== 2025-12-04T13:38:32.1688841Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.1688882Z cachedir: .pytest_cache 2025-12-04T13:38:32.1689052Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.1689098Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.1689140Z configfile: pytest.ini 2025-12-04T13:38:32.1689315Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.1689391Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.1689689Z stepcurrent: skipping 18 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda 2025-12-04T13:38:32.1689735Z Running 1 items in this shard 2025-12-04T13:38:32.1689738Z 2025-12-04T13:38:32.1690068Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda I1204 13:26:45.567000 414501 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 414570 2025-12-04T13:38:32.1690224Z I1204 13:26:45.568000 414501 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 414571 2025-12-04T13:38:32.1690378Z I1204 13:26:45.568000 414501 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 414572 2025-12-04T13:38:32.1690543Z I1204 13:26:45.569000 414501 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 414573 2025-12-04T13:38:32.1691128Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1691167Z _warn_cpu_init() 2025-12-04T13:38:32.1691469Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:38:32.1691508Z _init_core_state( 2025-12-04T13:38:32.1692017Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
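The FSDP UserWarnings repeated above ask for either an explicit torch.cuda.set_device() call before wrapping or an indexed device_id, so that the CPU module is moved and sharding initialization runs on the intended GPU. A minimal sketch of the suggested initialization with a toy module; it assumes the default process group is already initialized and that LOCAL_RANK comes from the launcher.

import os
import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, ShardingStrategy

local_rank = int(os.environ.get("LOCAL_RANK", "0"))
torch.cuda.set_device(local_rank)  # gives the bare "cuda" device an explicit index for this rank

model = nn.Linear(16, 16)  # toy stand-in for the test's mixture-of-experts model
fsdp_model = FSDP(
    model,
    device_id=local_rank,                           # explicit index instead of bare "cuda"
    sharding_strategy=ShardingStrategy.FULL_SHARD,  # FSDP falls back to NO_SHARD when the world size is 1
    sync_module_states=True,                        # requires the module to reach the GPU, per the warning
)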
2025-12-04T13:38:32.1692081Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1692651Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1692692Z _warn_cpu_init() 2025-12-04T13:38:32.1692986Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:38:32.1693026Z _init_core_state( 2025-12-04T13:38:32.1693537Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1693600Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1694175Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1694228Z _warn_cpu_init() 2025-12-04T13:38:32.1694528Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:38:32.1694565Z _init_core_state( 2025-12-04T13:38:32.1695056Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1695130Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1695699Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
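The FSDP UserWarnings above are initialization hints rather than failures: the wrapped module is still on CPU, and the device_id passed is the bare "cuda" device with no index. A minimal sketch of the setup those warnings recommend is below; the module, rank handling, and rendezvous settings are placeholders, not taken from this test.

    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def init_fsdp(rank: int, world_size: int) -> FSDP:
        # Assumes MASTER_ADDR / MASTER_PORT are already exported for rendezvous.
        torch.cuda.set_device(rank)  # make the current device explicit for this rank
        dist.init_process_group("nccl", rank=rank, world_size=world_size)
        module = torch.nn.Linear(8, 8)  # built on CPU, like the warned-about case
        # Passing an indexed device_id moves the module to that GPU for sharding init,
        # avoiding both the CPU-init warning and the "no explicit index" warning.
        return FSDP(module, device_id=torch.device("cuda", rank))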
2025-12-04T13:38:32.1695739Z _warn_cpu_init() 2025-12-04T13:38:32.1696229Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1696289Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1696791Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1696849Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1697146Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:38:32.1697184Z _init_core_state( 2025-12-04T13:38:32.1697674Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1697731Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1698228Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1698299Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1699614Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T13:38:32.1699758Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1701019Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1701146Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1702406Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1702528Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1703790Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
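The repeated AccumulateGrad stream-mismatch warning above is advisory, and the warning text itself names the switch for silencing it when the mismatch is intentional; a one-line sketch:

    import torch

    # Only appropriate if the stream mismatch described above is intentional.
    torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)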
2025-12-04T13:38:32.1703926Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1704157Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1704202Z return func(*args, **kwargs) 2025-12-04T13:38:32.1704426Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1704482Z return func(*args, **kwargs) 2025-12-04T13:38:32.1704703Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1704746Z return func(*args, **kwargs) 2025-12-04T13:38:32.1704966Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1705008Z return func(*args, **kwargs) 2025-12-04T13:38:32.1705228Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1705270Z return func(*args, **kwargs) 2025-12-04T13:38:32.1705490Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1705533Z return func(*args, **kwargs) 2025-12-04T13:38:32.1705753Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1705793Z return func(*args, **kwargs) 2025-12-04T13:38:32.1706025Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1706065Z return func(*args, **kwargs) 2025-12-04T13:38:32.1706358Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.
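The barrier() warning from c10d_logger above can be muted by giving init_process_group an explicit device, as the message suggests. A minimal sketch, assuming the rank and rendezvous details are provided elsewhere (the literal rank value here is a placeholder):

    import torch
    import torch.distributed as dist

    rank = 0  # placeholder; each process would use its own rank
    dist.init_process_group("nccl", device_id=torch.device("cuda", rank))
    dist.barrier()  # no longer has to infer the device from the current context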
2025-12-04T13:38:32.1706399Z return func(*args, **kwargs) 2025-12-04T13:38:32.1706547Z [rank0]:E1204 13:27:17.952000 414570 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1706712Z [rank0]:E1204 13:27:17.952000 414570 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1707004Z [rank0]:E1204 13:27:17.952000 414570 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1707160Z [rank0]:E1204 13:27:17.952000 414570 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1707458Z [rank0]:E1204 13:27:17.952000 414570 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1707595Z [rank0]:E1204 13:27:17.952000 414570 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1707871Z [rank0]:E1204 13:27:17.952000 414570 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1708022Z [rank0]:E1204 13:27:17.952000 414570 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1708298Z [rank0]:E1204 13:27:17.952000 414570 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1708448Z [rank0]:E1204 13:27:17.952000 414570 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1708722Z [rank0]:E1204 13:27:17.952000 414570 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1708872Z [rank0]:E1204 13:27:17.952000 414570 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1709152Z [rank0]:E1204 13:27:17.952000 414570 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1709299Z [rank0]:E1204 13:27:17.952000 414570 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1709847Z [rank0]:E1204 13:27:17.952000 414570 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17639145472. 
2025-12-04T13:38:32.1709964Z [rank0]:E1204 13:27:17.952000 414570 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1710187Z [rank0]:E1204 13:27:17.952000 414570 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1710571Z [rank0]:E1204 13:27:17.952000 414570 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda 2025-12-04T13:38:32.1710686Z [rank0]:E1204 13:27:17.952000 414570 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1710901Z [rank0]:E1204 13:27:17.952000 414570 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1711066Z [rank0]:E1204 13:27:17.952000 414570 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.1711109Z dist init r=0, world=4 2025-12-04T13:38:32.1711247Z [rank3]:E1204 13:27:17.959000 414573 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1711409Z [rank3]:E1204 13:27:17.959000 414573 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1711711Z [rank3]:E1204 13:27:17.959000 414573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1711879Z [rank3]:E1204 13:27:17.959000 414573 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1712164Z [rank3]:E1204 13:27:17.959000 414573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1712290Z [rank3]:E1204 13:27:17.959000 414573 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1712568Z [rank3]:E1204 13:27:17.959000 414573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1712716Z [rank3]:E1204 13:27:17.959000 414573 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1712995Z [rank3]:E1204 13:27:17.959000 414573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1713155Z [rank3]:E1204 13:27:17.959000 414573 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1713433Z [rank3]:E1204 13:27:17.959000 414573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1713572Z [rank3]:E1204 13:27:17.959000 414573 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T13:38:32.1713854Z [rank3]:E1204 13:27:17.959000 414573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1714004Z [rank3]:E1204 13:27:17.959000 414573 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1714517Z [rank3]:E1204 13:27:17.959000 414573 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17435721728. 2025-12-04T13:38:32.1714633Z [rank3]:E1204 13:27:17.959000 414573 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1714830Z [rank3]:E1204 13:27:17.959000 414573 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1715215Z [rank3]:E1204 13:27:17.959000 414573 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda 2025-12-04T13:38:32.1715331Z [rank3]:E1204 13:27:17.959000 414573 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1715542Z [rank3]:E1204 13:27:17.959000 414573 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1715707Z [rank3]:E1204 13:27:17.959000 414573 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.1715745Z dist init r=3, world=4 2025-12-04T13:38:32.1715896Z [rank2]:E1204 13:27:18.017000 414572 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1716055Z [rank2]:E1204 13:27:18.017000 414572 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1716360Z [rank2]:E1204 13:27:18.017000 414572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1716513Z [rank2]:E1204 13:27:18.017000 414572 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1716800Z [rank2]:E1204 13:27:18.017000 414572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1716924Z [rank2]:E1204 13:27:18.017000 414572 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1717202Z [rank2]:E1204 13:27:18.017000 414572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1717363Z [rank2]:E1204 13:27:18.017000 414572 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1717639Z [rank2]:E1204 13:27:18.017000 414572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1717787Z [rank2]:E1204 13:27:18.017000 414572 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1718063Z [rank2]:E1204 13:27:18.017000 414572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1718201Z [rank2]:E1204 13:27:18.017000 414572 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1718480Z [rank2]:E1204 13:27:18.017000 414572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1718630Z [rank2]:E1204 13:27:18.017000 414572 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1719146Z [rank2]:E1204 13:27:18.017000 414572 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17486053376. 2025-12-04T13:38:32.1719259Z [rank2]:E1204 13:27:18.017000 414572 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1719457Z [rank2]:E1204 13:27:18.017000 414572 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1719877Z [rank2]:E1204 13:27:18.017000 414572 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda 2025-12-04T13:38:32.1719991Z [rank2]:E1204 13:27:18.017000 414572 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1720220Z [rank2]:E1204 13:27:18.017000 414572 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1720383Z [rank2]:E1204 13:27:18.017000 414572 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.1720439Z dist init r=2, world=4 2025-12-04T13:38:32.1720576Z [rank1]:E1204 13:27:18.038000 414571 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1720737Z [rank1]:E1204 13:27:18.038000 414571 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1721025Z [rank1]:E1204 13:27:18.038000 414571 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T13:38:32.1721181Z [rank1]:E1204 13:27:18.038000 414571 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1721464Z [rank1]:E1204 13:27:18.038000 414571 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1721605Z [rank1]:E1204 13:27:18.038000 414571 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1721884Z [rank1]:E1204 13:27:18.038000 414571 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1722032Z [rank1]:E1204 13:27:18.038000 414571 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1722311Z [rank1]:E1204 13:27:18.038000 414571 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1722457Z [rank1]:E1204 13:27:18.038000 414571 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1722735Z [rank1]:E1204 13:27:18.038000 414571 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1722870Z [rank1]:E1204 13:27:18.038000 414571 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1723165Z [rank1]:E1204 13:27:18.038000 414571 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1723315Z [rank1]:E1204 13:27:18.038000 414571 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1723816Z [rank1]:E1204 13:27:18.038000 414571 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 1. CUDA driver allocated memory was 2317352960 and is now 17502830592. 
2025-12-04T13:38:32.1723932Z [rank1]:E1204 13:27:18.038000 414571 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1724128Z [rank1]:E1204 13:27:18.038000 414571 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1724520Z [rank1]:E1204 13:27:18.038000 414571 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda 2025-12-04T13:38:32.1724632Z [rank1]:E1204 13:27:18.038000 414571 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1724857Z [rank1]:E1204 13:27:18.038000 414571 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1725023Z [rank1]:E1204 13:27:18.038000 414571 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.1725061Z dist init r=1, world=4 2025-12-04T13:38:32.1725405Z [rank0]:[W1204 13:27:18.111540154 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1725737Z [rank3]:[W1204 13:27:18.129101688 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1726083Z [rank2]:[W1204 13:27:18.279274038 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1726410Z [rank1]:[W1204 13:27:18.327299371 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1726452Z FAILED [46.6698s] [100%] 2025-12-04T13:38:32.1726454Z 2025-12-04T13:38:32.1726514Z =================================== FAILURES =================================== 2025-12-04T13:38:32.1726636Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda _ 2025-12-04T13:38:32.1726684Z Traceback (most recent call last): 2025-12-04T13:38:32.1726849Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.1726895Z self._join_processes(fn) 2025-12-04T13:38:32.1727068Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.1727125Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.1727314Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.1727360Z raise RuntimeError(error) 2025-12-04T13:38:32.1727439Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.1727487Z Traceback (most recent call last): 2025-12-04T13:38:32.1727650Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1727695Z getattr(self, test_name)() 2025-12-04T13:38:32.1727853Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1727889Z fn() 2025-12-04T13:38:32.1728040Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1728083Z method(*args, **kwargs) 2025-12-04T13:38:32.1728236Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1728278Z method(*args, **kwargs) 2025-12-04T13:38:32.1728429Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1728478Z with policy(): 2025-12-04T13:38:32.1728634Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1728686Z raise RuntimeError(msg) 2025-12-04T13:38:32.1729067Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17639145472. 
2025-12-04T13:38:32.1729070Z 2025-12-04T13:38:32.1729146Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1729406Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda 2025-12-04T13:38:32.1729409Z 2025-12-04T13:38:32.1729496Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1729498Z 2025-12-04T13:38:32.1729500Z 2025-12-04T13:38:32.1729616Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.1729719Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.1729958Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-abfa87f8a65806d1.xml - 2025-12-04T13:38:32.1730021Z =========================== short test summary info ============================ 2025-12-04T13:38:32.1730293Z FAILED [46.6698s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.1730341Z Traceback (most recent call last): 2025-12-04T13:38:32.1730508Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1730552Z getattr(self, test_name)() 2025-12-04T13:38:32.1730712Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1730750Z fn() 2025-12-04T13:38:32.1730901Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1730942Z method(*args, **kwargs) 2025-12-04T13:38:32.1731092Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1731147Z method(*args, **kwargs) 2025-12-04T13:38:32.1731298Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1731337Z with policy(): 2025-12-04T13:38:32.1731489Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1731532Z raise RuntimeError(msg) 2025-12-04T13:38:32.1731908Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17639145472. 2025-12-04T13:38:32.1731913Z 2025-12-04T13:38:32.1731987Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1732247Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda 2025-12-04T13:38:32.1732249Z 2025-12-04T13:38:32.1732336Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1732414Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
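The ProcessGroupNCCL warnings earlier in this attempt ("destroy_process_group() was not called before program exit") point at missing teardown in the spawned worker processes. A minimal sketch of the recommended cleanup, with the test body elided:

    import torch.distributed as dist

    def worker(rank: int, world_size: int) -> None:
        dist.init_process_group("nccl", rank=rank, world_size=world_size)
        try:
            ...  # test / training body
        finally:
            dist.destroy_process_group()  # explicit teardown avoids the NCCL resource-leak warning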
2025-12-04T13:38:32.1732477Z ====================== 1 failed, 32 deselected in 46.83s ======================= 2025-12-04T13:38:32.1732531Z Got exit code 1 2025-12-04T13:38:32.1732572Z Retrying single test... 2025-12-04T13:38:32.1732764Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-7a493fb1ea0337b2.xml 2025-12-04T13:38:32.1732821Z ============================= test session starts ============================== 2025-12-04T13:38:32.1732936Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.1732977Z cachedir: .pytest_cache 2025-12-04T13:38:32.1733138Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.1733183Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.1733225Z configfile: pytest.ini 2025-12-04T13:38:32.1733390Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.1733469Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.1733733Z stepcurrent: skipping 18 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda 2025-12-04T13:38:32.1733777Z Running 1 items in this shard 2025-12-04T13:38:32.1733780Z 2025-12-04T13:38:32.1734113Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda I1204 13:27:34.820000 415767 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 415836 2025-12-04T13:38:32.1734268Z I1204 13:27:34.821000 415767 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 415837 2025-12-04T13:38:32.1734422Z I1204 13:27:34.821000 415767 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 415838 2025-12-04T13:38:32.1734573Z I1204 13:27:34.822000 415767 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 415839 2025-12-04T13:38:32.1735172Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1735211Z _warn_cpu_init() 2025-12-04T13:38:32.1735509Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:38:32.1735548Z _init_core_state( 2025-12-04T13:38:32.1736038Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:32.1736103Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1736688Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1736727Z _warn_cpu_init() 2025-12-04T13:38:32.1737038Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:38:32.1737076Z _init_core_state( 2025-12-04T13:38:32.1737570Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1737630Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1738202Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1738250Z _warn_cpu_init() 2025-12-04T13:38:32.1738546Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:38:32.1738584Z _init_core_state( 2025-12-04T13:38:32.1739076Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1739137Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1739756Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.1739795Z _warn_cpu_init() 2025-12-04T13:38:32.1740288Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1740347Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1740838Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1740894Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1741202Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:38:32.1741239Z _init_core_state( 2025-12-04T13:38:32.1741731Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1741811Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1742298Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1742356Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1743618Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T13:38:32.1743759Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1745025Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1745153Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1746411Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1746545Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1747797Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T13:38:32.1747929Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1748157Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1748203Z return func(*args, **kwargs) 2025-12-04T13:38:32.1748427Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1748471Z return func(*args, **kwargs) 2025-12-04T13:38:32.1748693Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1748736Z return func(*args, **kwargs) 2025-12-04T13:38:32.1748958Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1749009Z return func(*args, **kwargs) 2025-12-04T13:38:32.1749233Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1749273Z return func(*args, **kwargs) 2025-12-04T13:38:32.1749494Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1749535Z return func(*args, **kwargs) 2025-12-04T13:38:32.1749792Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1749832Z return func(*args, **kwargs) 2025-12-04T13:38:32.1750055Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1750095Z return func(*args, **kwargs) 2025-12-04T13:38:32.1750406Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.
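The NO_SHARD / full_state_dict warnings repeated above only note that with an effective world size of 1 FSDP falls back to NO_SHARD, so state_dict() already yields full, unsharded parameters. A sketch of requesting a full state dict explicitly; the model argument is assumed to be an FSDP-wrapped module such as the one sketched earlier:

    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, StateDictType

    def full_state_dict(model: FSDP) -> dict:
        # Explicitly ask for a full (unsharded) state dict; under NO_SHARD this is
        # what state_dict() returns anyway, which is all the warning above is saying.
        with FSDP.state_dict_type(model, StateDictType.FULL_STATE_DICT):
            return model.state_dict()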
2025-12-04T13:38:32.1750446Z return func(*args, **kwargs) 2025-12-04T13:38:32.1750592Z [rank2]:E1204 13:28:07.055000 415838 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1750767Z [rank2]:E1204 13:28:07.055000 415838 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1751058Z [rank2]:E1204 13:28:07.055000 415838 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1751215Z [rank2]:E1204 13:28:07.055000 415838 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1751501Z [rank2]:E1204 13:28:07.055000 415838 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1751627Z [rank2]:E1204 13:28:07.055000 415838 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1751904Z [rank2]:E1204 13:28:07.055000 415838 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1752068Z [rank2]:E1204 13:28:07.055000 415838 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1752347Z [rank2]:E1204 13:28:07.055000 415838 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1752497Z [rank2]:E1204 13:28:07.055000 415838 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1752774Z [rank2]:E1204 13:28:07.055000 415838 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1752913Z [rank2]:E1204 13:28:07.055000 415838 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1753191Z [rank2]:E1204 13:28:07.055000 415838 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1753351Z [rank2]:E1204 13:28:07.055000 415838 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1753864Z [rank2]:E1204 13:28:07.055000 415838 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17486053376. 
2025-12-04T13:38:32.1753982Z [rank2]:E1204 13:28:07.055000 415838 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1754176Z [rank2]:E1204 13:28:07.055000 415838 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1754571Z [rank2]:E1204 13:28:07.055000 415838 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda 2025-12-04T13:38:32.1754697Z [rank2]:E1204 13:28:07.055000 415838 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1754909Z [rank2]:E1204 13:28:07.055000 415838 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1755085Z [rank2]:E1204 13:28:07.055000 415838 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.1755126Z dist init r=2, world=4 2025-12-04T13:38:32.1755261Z [rank3]:E1204 13:28:07.060000 415839 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1755424Z [rank3]:E1204 13:28:07.060000 415839 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1755714Z [rank3]:E1204 13:28:07.060000 415839 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1755868Z [rank3]:E1204 13:28:07.060000 415839 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1756155Z [rank3]:E1204 13:28:07.060000 415839 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1756290Z [rank3]:E1204 13:28:07.060000 415839 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1756569Z [rank3]:E1204 13:28:07.060000 415839 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1756716Z [rank3]:E1204 13:28:07.060000 415839 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1756997Z [rank3]:E1204 13:28:07.060000 415839 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1757144Z [rank3]:E1204 13:28:07.060000 415839 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1757420Z [rank3]:E1204 13:28:07.060000 415839 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1757569Z [rank3]:E1204 13:28:07.060000 415839 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T13:38:32.1757846Z [rank3]:E1204 13:28:07.060000 415839 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1757997Z [rank3]:E1204 13:28:07.060000 415839 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1758498Z [rank3]:E1204 13:28:07.060000 415839 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17435721728. 2025-12-04T13:38:32.1758615Z [rank3]:E1204 13:28:07.060000 415839 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1758813Z [rank3]:E1204 13:28:07.060000 415839 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1759220Z [rank3]:E1204 13:28:07.060000 415839 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda 2025-12-04T13:38:32.1759345Z [rank3]:E1204 13:28:07.060000 415839 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1759555Z [rank3]:E1204 13:28:07.060000 415839 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1759764Z [rank3]:E1204 13:28:07.060000 415839 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.1759802Z dist init r=3, world=4 2025-12-04T13:38:32.1759940Z [rank1]:E1204 13:28:07.084000 415837 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1760101Z [rank1]:E1204 13:28:07.084000 415837 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1760389Z [rank1]:E1204 13:28:07.084000 415837 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1760563Z [rank1]:E1204 13:28:07.084000 415837 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1760847Z [rank1]:E1204 13:28:07.084000 415837 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1760972Z [rank1]:E1204 13:28:07.084000 415837 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1761248Z [rank1]:E1204 13:28:07.084000 415837 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1761399Z [rank1]:E1204 13:28:07.084000 415837 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1761677Z [rank1]:E1204 13:28:07.084000 415837 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1761841Z [rank1]:E1204 13:28:07.084000 415837 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1762119Z [rank1]:E1204 13:28:07.084000 415837 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1762255Z [rank1]:E1204 13:28:07.084000 415837 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1762534Z [rank1]:E1204 13:28:07.084000 415837 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1762682Z [rank1]:E1204 13:28:07.084000 415837 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1763187Z [rank1]:E1204 13:28:07.084000 415837 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 1. CUDA driver allocated memory was 2317352960 and is now 17502830592. 2025-12-04T13:38:32.1763312Z [rank1]:E1204 13:28:07.084000 415837 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1763509Z [rank1]:E1204 13:28:07.084000 415837 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1763908Z [rank1]:E1204 13:28:07.084000 415837 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda 2025-12-04T13:38:32.1764020Z [rank1]:E1204 13:28:07.084000 415837 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1764232Z [rank1]:E1204 13:28:07.084000 415837 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1764396Z [rank1]:E1204 13:28:07.084000 415837 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.1764436Z dist init r=1, world=4 2025-12-04T13:38:32.1764571Z [rank0]:E1204 13:28:07.107000 415836 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1764747Z [rank0]:E1204 13:28:07.107000 415836 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1765032Z [rank0]:E1204 13:28:07.107000 415836 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T13:38:32.1765187Z [rank0]:E1204 13:28:07.107000 415836 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1765473Z [rank0]:E1204 13:28:07.107000 415836 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1765594Z [rank0]:E1204 13:28:07.107000 415836 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1765873Z [rank0]:E1204 13:28:07.107000 415836 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1766021Z [rank0]:E1204 13:28:07.107000 415836 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1766314Z [rank0]:E1204 13:28:07.107000 415836 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1766462Z [rank0]:E1204 13:28:07.107000 415836 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1766738Z [rank0]:E1204 13:28:07.107000 415836 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1766877Z [rank0]:E1204 13:28:07.107000 415836 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1767154Z [rank0]:E1204 13:28:07.107000 415836 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1767302Z [rank0]:E1204 13:28:07.107000 415836 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1767818Z [rank0]:E1204 13:28:07.107000 415836 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17639145472. 
2025-12-04T13:38:32.1767947Z [rank0]:E1204 13:28:07.107000 415836 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1768141Z [rank0]:E1204 13:28:07.107000 415836 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1768527Z [rank0]:E1204 13:28:07.107000 415836 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda 2025-12-04T13:38:32.1768641Z [rank0]:E1204 13:28:07.107000 415836 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1768851Z [rank0]:E1204 13:28:07.107000 415836 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1769031Z [rank0]:E1204 13:28:07.107000 415836 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.1769069Z dist init r=0, world=4 2025-12-04T13:38:32.1769407Z [rank2]:[W1204 13:28:07.224533174 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1769944Z [rank3]:[W1204 13:28:07.238802770 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1770299Z [rank1]:[W1204 13:28:07.268634057 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1770627Z [rank0]:[W1204 13:28:07.358891388 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1770680Z FAILED [46.5692s] [100%] 2025-12-04T13:38:32.1770682Z 2025-12-04T13:38:32.1770740Z =================================== FAILURES =================================== 2025-12-04T13:38:32.1770861Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda _ 2025-12-04T13:38:32.1770910Z Traceback (most recent call last): 2025-12-04T13:38:32.1771075Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.1771121Z self._join_processes(fn) 2025-12-04T13:38:32.1771295Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.1771352Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.1771529Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.1771575Z raise RuntimeError(error) 2025-12-04T13:38:32.1771654Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.1771701Z Traceback (most recent call last): 2025-12-04T13:38:32.1771884Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1771926Z getattr(self, test_name)() 2025-12-04T13:38:32.1772087Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1772136Z fn() 2025-12-04T13:38:32.1772291Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1772332Z method(*args, **kwargs) 2025-12-04T13:38:32.1772487Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1772527Z method(*args, **kwargs) 2025-12-04T13:38:32.1772680Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1772717Z with policy(): 2025-12-04T13:38:32.1772874Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1772915Z raise RuntimeError(msg) 2025-12-04T13:38:32.1773297Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17486053376. 
2025-12-04T13:38:32.1773317Z 2025-12-04T13:38:32.1773392Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1773651Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda 2025-12-04T13:38:32.1773653Z 2025-12-04T13:38:32.1773742Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1773744Z 2025-12-04T13:38:32.1773803Z Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.1773851Z Traceback (most recent call last): 2025-12-04T13:38:32.1774014Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1774059Z getattr(self, test_name)() 2025-12-04T13:38:32.1774219Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1774255Z fn() 2025-12-04T13:38:32.1774405Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1774446Z method(*args, **kwargs) 2025-12-04T13:38:32.1774618Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1774659Z method(*args, **kwargs) 2025-12-04T13:38:32.1774811Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1774849Z with policy(): 2025-12-04T13:38:32.1775002Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1775046Z raise RuntimeError(msg) 2025-12-04T13:38:32.1775424Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17435721728. 2025-12-04T13:38:32.1775428Z 2025-12-04T13:38:32.1775503Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1775765Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda 2025-12-04T13:38:32.1775767Z 2025-12-04T13:38:32.1775867Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1775870Z 2025-12-04T13:38:32.1775871Z 2025-12-04T13:38:32.1775959Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.1776050Z Process 2 terminated with exit code 10, terminating remaining processes. 
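The leak check above compares two figures per device: the bytes the caching allocator reports as allocated, and the bytes the driver reports as in use. As a minimal sketch (not part of the test harness; it only assumes a CUDA/ROCm-enabled torch build), the same figures can be read through the public torch.cuda APIs:

import torch

def report_gpu_memory(device: int) -> None:
    # Bytes currently held by the caching allocator -- the first number
    # quoted in the leak report above ("allocated memory was 512 ...").
    allocated = torch.cuda.memory_allocated(device)
    # Free and total bytes as seen by the driver; total - free roughly
    # corresponds to the "CUDA driver allocated memory" figure.
    free, total = torch.cuda.mem_get_info(device)
    print(f"device {device}: allocator={allocated} driver={total - free}")

Calling such a helper before and after the test body mirrors what the mem-leak check does when it decides a test leaked.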
2025-12-04T13:38:32.1776289Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-7a493fb1ea0337b2.xml - 2025-12-04T13:38:32.1776349Z =========================== short test summary info ============================ 2025-12-04T13:38:32.1776628Z FAILED [46.5692s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.1776673Z Traceback (most recent call last): 2025-12-04T13:38:32.1776842Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1776884Z getattr(self, test_name)() 2025-12-04T13:38:32.1777052Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1777104Z fn() 2025-12-04T13:38:32.1777254Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1777295Z method(*args, **kwargs) 2025-12-04T13:38:32.1777449Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1777490Z method(*args, **kwargs) 2025-12-04T13:38:32.1777641Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1777679Z with policy(): 2025-12-04T13:38:32.1777836Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1777878Z raise RuntimeError(msg) 2025-12-04T13:38:32.1778258Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17486053376. 
2025-12-04T13:38:32.1778261Z 2025-12-04T13:38:32.1778337Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1778606Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda 2025-12-04T13:38:32.1778608Z 2025-12-04T13:38:32.1778696Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1778698Z 2025-12-04T13:38:32.1778760Z Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.1778803Z Traceback (most recent call last): 2025-12-04T13:38:32.1778968Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1779010Z getattr(self, test_name)() 2025-12-04T13:38:32.1779170Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1779204Z fn() 2025-12-04T13:38:32.1779359Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1779397Z method(*args, **kwargs) 2025-12-04T13:38:32.1779550Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1779630Z method(*args, **kwargs) 2025-12-04T13:38:32.1779798Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1779834Z with policy(): 2025-12-04T13:38:32.1780003Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1780044Z raise RuntimeError(msg) 2025-12-04T13:38:32.1780423Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17435721728. 2025-12-04T13:38:32.1780426Z 2025-12-04T13:38:32.1780498Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1780754Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda 2025-12-04T13:38:32.1780756Z 2025-12-04T13:38:32.1780844Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1780909Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:38:32.1780991Z ====================== 1 failed, 32 deselected in 46.73s ======================= 2025-12-04T13:38:32.1781027Z Got exit code 1 2025-12-04T13:38:32.1781237Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda 2025-12-04T13:38:32.1781366Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:32.1781556Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-033de2995758149b.xml 2025-12-04T13:38:32.1781614Z ============================= test session starts ============================== 2025-12-04T13:38:32.1781729Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.1781771Z cachedir: .pytest_cache 2025-12-04T13:38:32.1781932Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.1781979Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.1782021Z configfile: pytest.ini 2025-12-04T13:38:32.1782187Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.1782282Z collecting ... collected 60 items / 19 deselected / 41 selected 2025-12-04T13:38:32.1782336Z stepcurrent: skipping 19 already run items. 2025-12-04T13:38:32.1782382Z Running 14 items in this shard 2025-12-04T13:38:32.1782384Z 2025-12-04T13:38:32.1782732Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda I1204 13:28:23.837000 417033 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 417102 2025-12-04T13:38:32.1782887Z I1204 13:28:23.837000 417033 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 417103 2025-12-04T13:38:32.1783040Z I1204 13:28:23.838000 417033 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 417104 2025-12-04T13:38:32.1783190Z I1204 13:28:23.838000 417033 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 417105 2025-12-04T13:38:32.1783788Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1783836Z _warn_cpu_init() 2025-12-04T13:38:32.1784145Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T13:38:32.1784186Z _init_core_state( 2025-12-04T13:38:32.1784679Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1784744Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1785323Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1785377Z _warn_cpu_init() 2025-12-04T13:38:32.1785680Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T13:38:32.1785719Z _init_core_state( 2025-12-04T13:38:32.1786210Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1786272Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1786860Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1786897Z _warn_cpu_init() 2025-12-04T13:38:32.1787200Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T13:38:32.1787236Z _init_core_state( 2025-12-04T13:38:32.1787726Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1787788Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1788377Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.1788426Z _warn_cpu_init() 2025-12-04T13:38:32.1788911Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1788972Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1789465Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1789522Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1789862Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T13:38:32.1789924Z _init_core_state( 2025-12-04T13:38:32.1790417Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1790477Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1790959Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1791019Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1792302Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T13:38:32.1792430Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1793703Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1793841Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1795095Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1795234Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1796509Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T13:38:32.1796633Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1796859Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1796906Z return func(*args, **kwargs) 2025-12-04T13:38:32.1797128Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1797172Z return func(*args, **kwargs) 2025-12-04T13:38:32.1797393Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1797436Z return func(*args, **kwargs) 2025-12-04T13:38:32.1797668Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1797710Z return func(*args, **kwargs) 2025-12-04T13:38:32.1797930Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1797987Z return func(*args, **kwargs) 2025-12-04T13:38:32.1798209Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1798248Z return func(*args, **kwargs) 2025-12-04T13:38:32.1798471Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1798510Z return func(*args, **kwargs) 2025-12-04T13:38:32.1798740Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1798779Z return func(*args, **kwargs) 2025-12-04T13:38:32.1799073Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T13:38:32.1799129Z return func(*args, **kwargs) 2025-12-04T13:38:32.1799277Z [rank3]:E1204 13:28:56.096000 417105 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1799440Z [rank3]:E1204 13:28:56.096000 417105 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1799773Z [rank3]:E1204 13:28:56.096000 417105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1799930Z [rank3]:E1204 13:28:56.096000 417105 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1800220Z [rank3]:E1204 13:28:56.096000 417105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1800349Z [rank3]:E1204 13:28:56.096000 417105 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1800645Z [rank3]:E1204 13:28:56.096000 417105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1800797Z [rank3]:E1204 13:28:56.096000 417105 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1801075Z [rank3]:E1204 13:28:56.096000 417105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1801226Z [rank3]:E1204 13:28:56.096000 417105 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1801505Z [rank3]:E1204 13:28:56.096000 417105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1801644Z [rank3]:E1204 13:28:56.096000 417105 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1801943Z [rank3]:E1204 13:28:56.096000 417105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1802091Z [rank3]:E1204 13:28:56.096000 417105 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1802634Z [rank3]:E1204 13:28:56.096000 417105 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17435721728. 
2025-12-04T13:38:32.1802751Z [rank3]:E1204 13:28:56.096000 417105 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1802949Z [rank3]:E1204 13:28:56.096000 417105 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1803347Z [rank3]:E1204 13:28:56.096000 417105 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.1803479Z [rank3]:E1204 13:28:56.096000 417105 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1803692Z [rank3]:E1204 13:28:56.096000 417105 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1803856Z [rank3]:E1204 13:28:56.096000 417105 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.1803898Z dist init r=3, world=4 2025-12-04T13:38:32.1804034Z [rank1]:E1204 13:28:56.105000 417103 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1804195Z [rank1]:E1204 13:28:56.105000 417103 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1804482Z [rank1]:E1204 13:28:56.105000 417103 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1804639Z [rank1]:E1204 13:28:56.105000 417103 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1804942Z [rank1]:E1204 13:28:56.105000 417103 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1805065Z [rank1]:E1204 13:28:56.105000 417103 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1805344Z [rank1]:E1204 13:28:56.105000 417103 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1805493Z [rank1]:E1204 13:28:56.105000 417103 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1805773Z [rank1]:E1204 13:28:56.105000 417103 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1805920Z [rank1]:E1204 13:28:56.105000 417103 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1806213Z [rank1]:E1204 13:28:56.105000 417103 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1806350Z [rank1]:E1204 13:28:56.105000 417103 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T13:38:32.1806645Z [rank1]:E1204 13:28:56.105000 417103 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1806795Z [rank1]:E1204 13:28:56.105000 417103 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1807312Z [rank1]:E1204 13:28:56.105000 417103 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 1. CUDA driver allocated memory was 2317352960 and is now 17502830592. 2025-12-04T13:38:32.1807428Z [rank1]:E1204 13:28:56.105000 417103 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1807623Z [rank1]:E1204 13:28:56.105000 417103 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1808032Z [rank1]:E1204 13:28:56.105000 417103 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.1808148Z [rank1]:E1204 13:28:56.105000 417103 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1808357Z [rank1]:E1204 13:28:56.105000 417103 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1808524Z [rank1]:E1204 13:28:56.105000 417103 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.1808563Z dist init r=1, world=4 2025-12-04T13:38:32.1808701Z [rank2]:E1204 13:28:56.173000 417104 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1808861Z [rank2]:E1204 13:28:56.173000 417104 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1809164Z [rank2]:E1204 13:28:56.173000 417104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1809320Z [rank2]:E1204 13:28:56.173000 417104 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1809839Z [rank2]:E1204 13:28:56.173000 417104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1809969Z [rank2]:E1204 13:28:56.173000 417104 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1810247Z [rank2]:E1204 13:28:56.173000 417104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1810398Z [rank2]:E1204 13:28:56.173000 417104 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1810674Z [rank2]:E1204 13:28:56.173000 417104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1810839Z [rank2]:E1204 13:28:56.173000 417104 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1811115Z [rank2]:E1204 13:28:56.173000 417104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1811268Z [rank2]:E1204 13:28:56.173000 417104 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1811549Z [rank2]:E1204 13:28:56.173000 417104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1811697Z [rank2]:E1204 13:28:56.173000 417104 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1812215Z [rank2]:E1204 13:28:56.173000 417104 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17486053376. 2025-12-04T13:38:32.1812345Z [rank2]:E1204 13:28:56.173000 417104 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1812544Z [rank2]:E1204 13:28:56.173000 417104 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1812940Z [rank2]:E1204 13:28:56.173000 417104 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.1813054Z [rank2]:E1204 13:28:56.173000 417104 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1813267Z [rank2]:E1204 13:28:56.173000 417104 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1813431Z [rank2]:E1204 13:28:56.173000 417104 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.1813473Z dist init r=2, world=4 2025-12-04T13:38:32.1813627Z [rank0]:E1204 13:28:56.174000 417102 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1813790Z [rank0]:E1204 13:28:56.174000 417102 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1814082Z [rank0]:E1204 13:28:56.174000 417102 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in 
run_test 2025-12-04T13:38:32.1814245Z [rank0]:E1204 13:28:56.174000 417102 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1814536Z [rank0]:E1204 13:28:56.174000 417102 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1814662Z [rank0]:E1204 13:28:56.174000 417102 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1814947Z [rank0]:E1204 13:28:56.174000 417102 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1815109Z [rank0]:E1204 13:28:56.174000 417102 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1815391Z [rank0]:E1204 13:28:56.174000 417102 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1815549Z [rank0]:E1204 13:28:56.174000 417102 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1815828Z [rank0]:E1204 13:28:56.174000 417102 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1815966Z [rank0]:E1204 13:28:56.174000 417102 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1816244Z [rank0]:E1204 13:28:56.174000 417102 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1816395Z [rank0]:E1204 13:28:56.174000 417102 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1816922Z [rank0]:E1204 13:28:56.174000 417102 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17639145472. 
2025-12-04T13:38:32.1817040Z [rank0]:E1204 13:28:56.174000 417102 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1817235Z [rank0]:E1204 13:28:56.174000 417102 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1817629Z [rank0]:E1204 13:28:56.174000 417102 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.1817746Z [rank0]:E1204 13:28:56.174000 417102 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1817970Z [rank0]:E1204 13:28:56.174000 417102 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1818138Z [rank0]:E1204 13:28:56.174000 417102 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.1818177Z dist init r=0, world=4 2025-12-04T13:38:32.1818517Z [rank3]:[W1204 13:28:56.274381314 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1818849Z [rank1]:[W1204 13:28:56.314020855 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1819182Z [rank0]:[W1204 13:28:56.570280825 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1819520Z [rank2]:[W1204 13:28:56.575162453 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1819561Z FAILED [46.4685s] [ 7%] 2025-12-04T13:38:32.1819610Z 2025-12-04T13:38:32.1819675Z =================================== FAILURES =================================== 2025-12-04T13:38:32.1819810Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda _ 2025-12-04T13:38:32.1819860Z Traceback (most recent call last): 2025-12-04T13:38:32.1820024Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.1820072Z self._join_processes(fn) 2025-12-04T13:38:32.1820247Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.1820306Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.1820485Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.1820532Z raise RuntimeError(error) 2025-12-04T13:38:32.1820614Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.1820679Z Traceback (most recent call last): 2025-12-04T13:38:32.1820842Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1820887Z getattr(self, test_name)() 2025-12-04T13:38:32.1821049Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1821086Z fn() 2025-12-04T13:38:32.1821243Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1821285Z method(*args, **kwargs) 2025-12-04T13:38:32.1821442Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1821483Z method(*args, **kwargs) 2025-12-04T13:38:32.1821638Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1821677Z with policy(): 2025-12-04T13:38:32.1821834Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1821876Z raise RuntimeError(msg) 2025-12-04T13:38:32.1822284Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17435721728. 
2025-12-04T13:38:32.1822287Z 2025-12-04T13:38:32.1822364Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1822637Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.1822641Z 2025-12-04T13:38:32.1822731Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1822733Z 2025-12-04T13:38:32.1822735Z 2025-12-04T13:38:32.1822810Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.1822902Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.1823136Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-033de2995758149b.xml - 2025-12-04T13:38:32.1823199Z =========================== short test summary info ============================ 2025-12-04T13:38:32.1823500Z FAILED [46.4685s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.1823563Z Traceback (most recent call last): 2025-12-04T13:38:32.1823729Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1823775Z getattr(self, test_name)() 2025-12-04T13:38:32.1823937Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1823976Z fn() 2025-12-04T13:38:32.1824129Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1824174Z method(*args, **kwargs) 2025-12-04T13:38:32.1824327Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1824370Z method(*args, **kwargs) 2025-12-04T13:38:32.1824525Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1824574Z with policy(): 2025-12-04T13:38:32.1824729Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1824771Z raise RuntimeError(msg) 2025-12-04T13:38:32.1825162Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17435721728. 2025-12-04T13:38:32.1825164Z 2025-12-04T13:38:32.1825238Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1825509Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.1825513Z 2025-12-04T13:38:32.1825600Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1825668Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
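[editor's note] The failure above is raised by the memory-leak checker enabled with PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1: it snapshots per-device memory before the test and compares it again afterwards, and the two numbers quoted in the RuntimeError are those snapshots. Below is a minimal, hedged sketch of that before/after comparison, using torch.cuda.memory_allocated and torch.cuda.mem_get_info as stand-ins for the internal accounting; the actual check lives in torch/testing/_internal/common_utils.py and differs in detail, and the helper name leak_check is illustrative only.

    # Illustrative sketch only -- not the CudaMemoryLeakCheck implementation
    # from torch/testing/_internal/common_utils.py.
    import contextlib
    import torch

    @contextlib.contextmanager
    def leak_check(device: int):
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        allocator_before = torch.cuda.memory_allocated(device)
        free_before, total = torch.cuda.mem_get_info(device)
        yield
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        allocator_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        if allocator_after > allocator_before:
            # Mirrors the "Caching allocator allocated memory was ... and is now ..." message.
            raise RuntimeError(
                f"Caching allocator allocated memory was {allocator_before} "
                f"and is now reported as {allocator_after} on device {device}."
            )
        if free_after < free_before:
            # Mirrors the "CUDA driver allocated memory was ... and is now ..." message.
            raise RuntimeError(
                f"CUDA driver allocated memory was {total - free_before} "
                f"and is now {total - free_after}."
            )

On a ROCm build, the same torch.cuda calls are backed by HIP, so the check behaves the same way under PYTORCH_TEST_WITH_ROCM=1.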
2025-12-04T13:38:32.1825732Z ====================== 1 failed, 19 deselected in 46.63s ======================= 2025-12-04T13:38:32.1825775Z Got exit code 1 2025-12-04T13:38:32.1825816Z Retrying single test... 2025-12-04T13:38:32.1826021Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-0cf86fdbf3893144.xml 2025-12-04T13:38:32.1826081Z ============================= test session starts ============================== 2025-12-04T13:38:32.1826199Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.1826244Z cachedir: .pytest_cache 2025-12-04T13:38:32.1826403Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.1826455Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.1826496Z configfile: pytest.ini 2025-12-04T13:38:32.1826663Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.1826737Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.1827003Z stepcurrent: skipping 19 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.1827048Z Running 1 items in this shard 2025-12-04T13:38:32.1827050Z 2025-12-04T13:38:32.1827400Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda I1204 13:29:12.963000 418299 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 418368 2025-12-04T13:38:32.1827567Z I1204 13:29:12.963000 418299 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 418369 2025-12-04T13:38:32.1827724Z I1204 13:29:12.964000 418299 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 418370 2025-12-04T13:38:32.1827875Z I1204 13:29:12.964000 418299 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 418371 2025-12-04T13:38:32.1828465Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1828507Z _warn_cpu_init() 2025-12-04T13:38:32.1828814Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T13:38:32.1828866Z _init_core_state( 2025-12-04T13:38:32.1829360Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1829426Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1830036Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1830076Z _warn_cpu_init() 2025-12-04T13:38:32.1830395Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T13:38:32.1830434Z _init_core_state( 2025-12-04T13:38:32.1830930Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1830993Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1831566Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1831607Z _warn_cpu_init() 2025-12-04T13:38:32.1831920Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T13:38:32.1831963Z _init_core_state( 2025-12-04T13:38:32.1832467Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1832531Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1833104Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.1833143Z _warn_cpu_init() 2025-12-04T13:38:32.1833640Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1833713Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1834203Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1834264Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1834753Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1834828Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1835128Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T13:38:32.1835169Z _init_core_state( 2025-12-04T13:38:32.1835653Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1835715Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1836995Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T13:38:32.1837134Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1838408Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1838546Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1839859Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1839981Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1841233Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T13:38:32.1841375Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1841609Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1841669Z return func(*args, **kwargs) 2025-12-04T13:38:32.1841897Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1841939Z return func(*args, **kwargs) 2025-12-04T13:38:32.1842166Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1842207Z return func(*args, **kwargs) 2025-12-04T13:38:32.1842433Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1842473Z return func(*args, **kwargs) 2025-12-04T13:38:32.1842695Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1842750Z return func(*args, **kwargs) 2025-12-04T13:38:32.1842971Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1843012Z return func(*args, **kwargs) 2025-12-04T13:38:32.1843236Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1843278Z return func(*args, **kwargs) 2025-12-04T13:38:32.1843503Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1843547Z return func(*args, **kwargs) 2025-12-04T13:38:32.1843841Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T13:38:32.1843886Z return func(*args, **kwargs) 2025-12-04T13:38:32.1844032Z [rank2]:E1204 13:29:45.487000 418370 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1844211Z [rank2]:E1204 13:29:45.487000 418370 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1844504Z [rank2]:E1204 13:29:45.487000 418370 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1844664Z [rank2]:E1204 13:29:45.487000 418370 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1844951Z [rank2]:E1204 13:29:45.487000 418370 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1845079Z [rank2]:E1204 13:29:45.487000 418370 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1845359Z [rank2]:E1204 13:29:45.487000 418370 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1845508Z [rank2]:E1204 13:29:45.487000 418370 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1845801Z [rank2]:E1204 13:29:45.487000 418370 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1845961Z [rank2]:E1204 13:29:45.487000 418370 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1846242Z [rank2]:E1204 13:29:45.487000 418370 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1846381Z [rank2]:E1204 13:29:45.487000 418370 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1846663Z [rank2]:E1204 13:29:45.487000 418370 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1846815Z [rank2]:E1204 13:29:45.487000 418370 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1847335Z [rank2]:E1204 13:29:45.487000 418370 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17486053376. 
2025-12-04T13:38:32.1847468Z [rank2]:E1204 13:29:45.487000 418370 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1847664Z [rank2]:E1204 13:29:45.487000 418370 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1848066Z [rank2]:E1204 13:29:45.487000 418370 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.1848183Z [rank2]:E1204 13:29:45.487000 418370 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1848399Z [rank2]:E1204 13:29:45.487000 418370 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1848577Z [rank2]:E1204 13:29:45.487000 418370 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.1848617Z dist init r=2, world=4 2025-12-04T13:38:32.1848757Z [rank1]:E1204 13:29:45.492000 418369 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1848917Z [rank1]:E1204 13:29:45.492000 418369 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1849209Z [rank1]:E1204 13:29:45.492000 418369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1849364Z [rank1]:E1204 13:29:45.492000 418369 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1849706Z [rank1]:E1204 13:29:45.492000 418369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1849830Z [rank1]:E1204 13:29:45.492000 418369 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1850127Z [rank1]:E1204 13:29:45.492000 418369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1850291Z [rank1]:E1204 13:29:45.492000 418369 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1850570Z [rank1]:E1204 13:29:45.492000 418369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1850722Z [rank1]:E1204 13:29:45.492000 418369 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1850998Z [rank1]:E1204 13:29:45.492000 418369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1851139Z [rank1]:E1204 13:29:45.492000 418369 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T13:38:32.1851418Z [rank1]:E1204 13:29:45.492000 418369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1851581Z [rank1]:E1204 13:29:45.492000 418369 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1852095Z [rank1]:E1204 13:29:45.492000 418369 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 1. CUDA driver allocated memory was 2317352960 and is now 17502830592. 2025-12-04T13:38:32.1852211Z [rank1]:E1204 13:29:45.492000 418369 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1852409Z [rank1]:E1204 13:29:45.492000 418369 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1852804Z [rank1]:E1204 13:29:45.492000 418369 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.1852939Z [rank1]:E1204 13:29:45.492000 418369 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1853151Z [rank1]:E1204 13:29:45.492000 418369 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1853320Z [rank1]:E1204 13:29:45.492000 418369 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.1853461Z [rank0]:E1204 13:29:45.492000 418368 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1853622Z [rank0]:E1204 13:29:45.492000 418368 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1853912Z [rank0]:E1204 13:29:45.492000 418368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1854066Z [rank0]:E1204 13:29:45.492000 418368 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1854365Z [rank0]:E1204 13:29:45.492000 418368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1854489Z [rank0]:E1204 13:29:45.492000 418368 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1854780Z [rank0]:E1204 13:29:45.492000 418368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1854930Z [rank0]:E1204 13:29:45.492000 418368 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1855213Z [rank0]:E1204 13:29:45.492000 418368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1855363Z [rank0]:E1204 13:29:45.492000 418368 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1855641Z [rank0]:E1204 13:29:45.492000 418368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1855797Z [rank0]:E1204 13:29:45.492000 418368 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1856079Z [rank0]:E1204 13:29:45.492000 418368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1856230Z [rank0]:E1204 13:29:45.492000 418368 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1856747Z [rank0]:E1204 13:29:45.492000 418368 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17639145472. 2025-12-04T13:38:32.1856862Z [rank0]:E1204 13:29:45.492000 418368 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1857059Z [rank0]:E1204 13:29:45.492000 418368 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1857466Z [rank0]:E1204 13:29:45.492000 418368 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.1857584Z [rank0]:E1204 13:29:45.492000 418368 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1857794Z [rank0]:E1204 13:29:45.492000 418368 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1857963Z [rank0]:E1204 13:29:45.492000 418368 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.1858006Z dist init r=1, world=4 2025-12-04T13:38:32.1858045Z dist init r=0, world=4 2025-12-04T13:38:32.1858187Z [rank3]:E1204 13:29:45.569000 418371 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1858348Z [rank3]:E1204 13:29:45.569000 418371 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1858648Z [rank3]:E1204 13:29:45.569000 418371 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1858802Z [rank3]:E1204 13:29:45.569000 418371 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1859103Z [rank3]:E1204 13:29:45.569000 418371 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1859226Z [rank3]:E1204 13:29:45.569000 418371 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1859507Z [rank3]:E1204 13:29:45.569000 418371 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1859697Z [rank3]:E1204 13:29:45.569000 418371 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1859973Z [rank3]:E1204 13:29:45.569000 418371 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1860137Z [rank3]:E1204 13:29:45.569000 418371 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1860418Z [rank3]:E1204 13:29:45.569000 418371 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1860559Z [rank3]:E1204 13:29:45.569000 418371 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1860839Z [rank3]:E1204 13:29:45.569000 418371 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1860990Z [rank3]:E1204 13:29:45.569000 418371 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1861521Z [rank3]:E1204 13:29:45.569000 418371 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17435721728. 
2025-12-04T13:38:32.1861635Z [rank3]:E1204 13:29:45.569000 418371 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1861834Z [rank3]:E1204 13:29:45.569000 418371 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1862224Z [rank3]:E1204 13:29:45.569000 418371 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.1862342Z [rank3]:E1204 13:29:45.569000 418371 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1862552Z [rank3]:E1204 13:29:45.569000 418371 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1862719Z [rank3]:E1204 13:29:45.569000 418371 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.1862761Z dist init r=3, world=4 2025-12-04T13:38:32.1863111Z [rank2]:[W1204 13:29:45.653916674 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1863459Z [rank0]:[W1204 13:29:45.677580620 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1863792Z [rank1]:[W1204 13:29:45.682040213 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1864122Z [rank3]:[W1204 13:29:45.839237476 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1864163Z FAILED [46.8716s] [100%] 2025-12-04T13:38:32.1864170Z 2025-12-04T13:38:32.1864227Z =================================== FAILURES =================================== 2025-12-04T13:38:32.1864377Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda _ 2025-12-04T13:38:32.1864424Z Traceback (most recent call last): 2025-12-04T13:38:32.1864592Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.1864637Z self._join_processes(fn) 2025-12-04T13:38:32.1864813Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.1864868Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.1865050Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.1865094Z raise RuntimeError(error) 2025-12-04T13:38:32.1865180Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.1865227Z Traceback (most recent call last): 2025-12-04T13:38:32.1865392Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1865435Z getattr(self, test_name)() 2025-12-04T13:38:32.1865597Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1865644Z fn() 2025-12-04T13:38:32.1865801Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1865843Z method(*args, **kwargs) 2025-12-04T13:38:32.1865999Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1866041Z method(*args, **kwargs) 2025-12-04T13:38:32.1866196Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1866236Z with policy(): 2025-12-04T13:38:32.1866393Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1866435Z raise RuntimeError(msg) 2025-12-04T13:38:32.1866825Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17639145472. 
2025-12-04T13:38:32.1866827Z 2025-12-04T13:38:32.1866908Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1867190Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.1867203Z 2025-12-04T13:38:32.1867294Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1867296Z 2025-12-04T13:38:32.1867298Z 2025-12-04T13:38:32.1867373Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.1867463Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.1867699Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-0cf86fdbf3893144.xml - 2025-12-04T13:38:32.1867763Z =========================== short test summary info ============================ 2025-12-04T13:38:32.1868046Z FAILED [46.8716s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.1868093Z Traceback (most recent call last): 2025-12-04T13:38:32.1868274Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1868317Z getattr(self, test_name)() 2025-12-04T13:38:32.1868481Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1868517Z fn() 2025-12-04T13:38:32.1868673Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1868715Z method(*args, **kwargs) 2025-12-04T13:38:32.1868870Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1868911Z method(*args, **kwargs) 2025-12-04T13:38:32.1869066Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1869105Z with policy(): 2025-12-04T13:38:32.1869261Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1869302Z raise RuntimeError(msg) 2025-12-04T13:38:32.1869745Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17639145472. 2025-12-04T13:38:32.1869748Z 2025-12-04T13:38:32.1869825Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1870094Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.1870098Z 2025-12-04T13:38:32.1870187Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1870252Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
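[editor's note] The FSDP UserWarnings repeated in both runs point at the same pattern: the test passes a bare `cuda` device as `device_id` and leaves the module on CPU, so FSDP has to guess the device index and falls back to CPU sharding initialization. A hedged sketch of the fix those warnings themselves recommend follows; the helper name and world-size handling are illustrative and not taken from test_fsdp_core.py, and an initialized process group is assumed.

    # Illustrative sketch of the pattern the warnings above ask for: pin each rank
    # to an explicit device index before constructing FSDP.
    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_with_fsdp(module: torch.nn.Module) -> FSDP:
        # Assumes dist.init_process_group() has already run on this rank.
        local_device = torch.device("cuda", dist.get_rank() % torch.cuda.device_count())
        # Explicit device selection, as "please explicitly call torch.cuda.set_device()" suggests.
        torch.cuda.set_device(local_device)
        # An indexed device_id avoids the "does not have an explicit index" warning and
        # moves the CPU module to the GPU for sharding init, addressing _warn_cpu_init().
        return FSDP(module, device_id=local_device)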
2025-12-04T13:38:32.1870319Z ====================== 1 failed, 32 deselected in 47.03s ======================= 2025-12-04T13:38:32.1870357Z Got exit code 1 2025-12-04T13:38:32.1870401Z Retrying single test... 2025-12-04T13:38:32.1870592Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-8217f4cff6fb20ac.xml 2025-12-04T13:38:32.1870653Z ============================= test session starts ============================== 2025-12-04T13:38:32.1870766Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.1870824Z cachedir: .pytest_cache 2025-12-04T13:38:32.1870984Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.1871057Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.1871099Z configfile: pytest.ini 2025-12-04T13:38:32.1871267Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.1871342Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.1871607Z stepcurrent: skipping 19 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.1871651Z Running 1 items in this shard 2025-12-04T13:38:32.1871657Z 2025-12-04T13:38:32.1871998Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda I1204 13:30:02.345000 419565 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 419634 2025-12-04T13:38:32.1872159Z I1204 13:30:02.345000 419565 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 419635 2025-12-04T13:38:32.1872329Z I1204 13:30:02.346000 419565 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 419636 2025-12-04T13:38:32.1876232Z I1204 13:30:02.347000 419565 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 419637 2025-12-04T13:38:32.1876826Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1876868Z _warn_cpu_init() 2025-12-04T13:38:32.1877179Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T13:38:32.1877221Z _init_core_state( 2025-12-04T13:38:32.1877738Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1877807Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1878379Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1878421Z _warn_cpu_init() 2025-12-04T13:38:32.1878726Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T13:38:32.1878763Z _init_core_state( 2025-12-04T13:38:32.1879267Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1879340Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1879946Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1879986Z _warn_cpu_init() 2025-12-04T13:38:32.1880285Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T13:38:32.1880324Z _init_core_state( 2025-12-04T13:38:32.1880813Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1880894Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1881468Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.1881507Z _warn_cpu_init() 2025-12-04T13:38:32.1882003Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1882062Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1882571Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1882629Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1882929Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.SHARD_GRAD_OP since the world size is 1. 2025-12-04T13:38:32.1882970Z _init_core_state( 2025-12-04T13:38:32.1883458Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1883519Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1884018Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.1884094Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.1885358Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T13:38:32.1885496Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1886757Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1886895Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1888152Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1888273Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1889541Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T13:38:32.1889733Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1889963Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1890024Z return func(*args, **kwargs) 2025-12-04T13:38:32.1890248Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1890294Z return func(*args, **kwargs) 2025-12-04T13:38:32.1890518Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1890559Z return func(*args, **kwargs) 2025-12-04T13:38:32.1890783Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1890824Z return func(*args, **kwargs) 2025-12-04T13:38:32.1891049Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1891092Z return func(*args, **kwargs) 2025-12-04T13:38:32.1891314Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1891354Z return func(*args, **kwargs) 2025-12-04T13:38:32.1891590Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1891632Z return func(*args, **kwargs) 2025-12-04T13:38:32.1891855Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1891896Z return func(*args, **kwargs) 2025-12-04T13:38:32.1892192Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T13:38:32.1892237Z return func(*args, **kwargs) 2025-12-04T13:38:32.1892383Z [rank2]:E1204 13:30:34.582000 419636 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1892549Z [rank2]:E1204 13:30:34.582000 419636 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1892867Z [rank2]:E1204 13:30:34.582000 419636 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1893024Z [rank2]:E1204 13:30:34.582000 419636 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1893323Z [rank2]:E1204 13:30:34.582000 419636 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1893451Z [rank2]:E1204 13:30:34.582000 419636 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1893729Z [rank2]:E1204 13:30:34.582000 419636 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1893880Z [rank2]:E1204 13:30:34.582000 419636 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1894161Z [rank2]:E1204 13:30:34.582000 419636 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1894320Z [rank2]:E1204 13:30:34.582000 419636 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1894600Z [rank2]:E1204 13:30:34.582000 419636 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1894737Z [rank2]:E1204 13:30:34.582000 419636 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1895018Z [rank2]:E1204 13:30:34.582000 419636 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1895165Z [rank2]:E1204 13:30:34.582000 419636 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1895686Z [rank2]:E1204 13:30:34.582000 419636 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17486053376. 
2025-12-04T13:38:32.1895815Z [rank2]:E1204 13:30:34.582000 419636 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1896011Z [rank2]:E1204 13:30:34.582000 419636 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1896412Z [rank2]:E1204 13:30:34.582000 419636 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.1896528Z [rank2]:E1204 13:30:34.582000 419636 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1896743Z [rank2]:E1204 13:30:34.582000 419636 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1896910Z [rank2]:E1204 13:30:34.582000 419636 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.1897050Z [rank1]:E1204 13:30:34.582000 419635 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1897223Z [rank1]:E1204 13:30:34.582000 419635 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1897508Z [rank1]:E1204 13:30:34.582000 419635 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1897676Z [rank1]:E1204 13:30:34.582000 419635 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1897960Z [rank1]:E1204 13:30:34.582000 419635 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1898087Z [rank1]:E1204 13:30:34.582000 419635 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1898364Z [rank1]:E1204 13:30:34.582000 419635 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1898513Z [rank1]:E1204 13:30:34.582000 419635 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1898804Z [rank1]:E1204 13:30:34.582000 419635 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1898951Z [rank1]:E1204 13:30:34.582000 419635 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1899232Z [rank1]:E1204 13:30:34.582000 419635 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1899368Z [rank1]:E1204 13:30:34.582000 419635 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1899698Z 
[rank1]:E1204 13:30:34.582000 419635 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1899849Z [rank1]:E1204 13:30:34.582000 419635 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1900388Z [rank1]:E1204 13:30:34.582000 419635 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 1. CUDA driver allocated memory was 2317352960 and is now 17502830592. 2025-12-04T13:38:32.1900505Z [rank1]:E1204 13:30:34.582000 419635 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1900699Z [rank1]:E1204 13:30:34.582000 419635 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1901094Z [rank1]:E1204 13:30:34.582000 419635 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.1901207Z [rank1]:E1204 13:30:34.582000 419635 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1901419Z [rank1]:E1204 13:30:34.582000 419635 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1901597Z [rank1]:E1204 13:30:34.582000 419635 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.1901640Z dist init r=2, world=4 2025-12-04T13:38:32.1901678Z dist init r=1, world=4 2025-12-04T13:38:32.1901835Z [rank0]:E1204 13:30:34.640000 419634 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1901999Z [rank0]:E1204 13:30:34.640000 419634 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1902287Z [rank0]:E1204 13:30:34.640000 419634 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1902442Z [rank0]:E1204 13:30:34.640000 419634 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1902729Z [rank0]:E1204 13:30:34.640000 419634 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1902857Z [rank0]:E1204 13:30:34.640000 419634 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1903145Z [rank0]:E1204 13:30:34.640000 419634 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1903298Z [rank0]:E1204 13:30:34.640000 419634 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1903577Z [rank0]:E1204 13:30:34.640000 419634 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1903723Z [rank0]:E1204 13:30:34.640000 419634 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1904001Z [rank0]:E1204 13:30:34.640000 419634 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1904138Z [rank0]:E1204 13:30:34.640000 419634 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1904431Z [rank0]:E1204 13:30:34.640000 419634 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1904579Z [rank0]:E1204 13:30:34.640000 419634 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1905096Z [rank0]:E1204 13:30:34.640000 419634 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17639145472. 2025-12-04T13:38:32.1905213Z [rank0]:E1204 13:30:34.640000 419634 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1905408Z [rank0]:E1204 13:30:34.640000 419634 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1905802Z [rank0]:E1204 13:30:34.640000 419634 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.1905927Z [rank0]:E1204 13:30:34.640000 419634 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1906140Z [rank0]:E1204 13:30:34.640000 419634 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1906316Z [rank0]:E1204 13:30:34.640000 419634 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.1906356Z dist init r=0, world=4 2025-12-04T13:38:32.1906494Z [rank3]:E1204 13:30:34.666000 419637 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1906655Z [rank3]:E1204 13:30:34.666000 419637 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1906944Z [rank3]:E1204 13:30:34.666000 419637 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in 
run_test 2025-12-04T13:38:32.1907097Z [rank3]:E1204 13:30:34.666000 419637 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1907396Z [rank3]:E1204 13:30:34.666000 419637 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1907520Z [rank3]:E1204 13:30:34.666000 419637 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1907799Z [rank3]:E1204 13:30:34.666000 419637 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1907946Z [rank3]:E1204 13:30:34.666000 419637 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1908225Z [rank3]:E1204 13:30:34.666000 419637 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1908374Z [rank3]:E1204 13:30:34.666000 419637 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1908665Z [rank3]:E1204 13:30:34.666000 419637 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1908803Z [rank3]:E1204 13:30:34.666000 419637 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1909082Z [rank3]:E1204 13:30:34.666000 419637 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1909230Z [rank3]:E1204 13:30:34.666000 419637 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1909780Z [rank3]:E1204 13:30:34.666000 419637 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17435721728. 
2025-12-04T13:38:32.1909895Z [rank3]:E1204 13:30:34.666000 419637 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1910113Z [rank3]:E1204 13:30:34.666000 419637 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1910505Z [rank3]:E1204 13:30:34.666000 419637 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.1910634Z [rank3]:E1204 13:30:34.666000 419637 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1910847Z [rank3]:E1204 13:30:34.666000 419637 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1911013Z [rank3]:E1204 13:30:34.666000 419637 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.1911052Z dist init r=3, world=4 2025-12-04T13:38:32.1911391Z [rank2]:[W1204 13:30:34.756873150 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1911735Z [rank1]:[W1204 13:30:34.757099897 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1912065Z [rank0]:[W1204 13:30:34.906269204 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1912392Z [rank3]:[W1204 13:30:34.945333733 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1912434Z FAILED [46.5703s] [100%] 2025-12-04T13:38:32.1912437Z 2025-12-04T13:38:32.1912498Z =================================== FAILURES =================================== 2025-12-04T13:38:32.1912631Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda _ 2025-12-04T13:38:32.1912680Z Traceback (most recent call last): 2025-12-04T13:38:32.1912845Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.1912888Z self._join_processes(fn) 2025-12-04T13:38:32.1913077Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.1913132Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.1913312Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.1913358Z raise RuntimeError(error) 2025-12-04T13:38:32.1913441Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.1913487Z Traceback (most recent call last): 2025-12-04T13:38:32.1913651Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1913694Z getattr(self, test_name)() 2025-12-04T13:38:32.1913854Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1913889Z fn() 2025-12-04T13:38:32.1914046Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1914087Z method(*args, **kwargs) 2025-12-04T13:38:32.1914254Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1914295Z method(*args, **kwargs) 2025-12-04T13:38:32.1914448Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1914497Z with policy(): 2025-12-04T13:38:32.1914654Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1914695Z raise RuntimeError(msg) 2025-12-04T13:38:32.1915083Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 1. CUDA driver allocated memory was 2317352960 and is now 17502830592. 
2025-12-04T13:38:32.1915086Z 2025-12-04T13:38:32.1915164Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1915431Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.1915444Z 2025-12-04T13:38:32.1915535Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1915537Z 2025-12-04T13:38:32.1915538Z 2025-12-04T13:38:32.1915615Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.1915706Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.1915942Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-8217f4cff6fb20ac.xml - 2025-12-04T13:38:32.1916006Z =========================== short test summary info ============================ 2025-12-04T13:38:32.1916288Z FAILED [46.5703s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.1916338Z Traceback (most recent call last): 2025-12-04T13:38:32.1916512Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1916553Z getattr(self, test_name)() 2025-12-04T13:38:32.1916717Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1916752Z fn() 2025-12-04T13:38:32.1916917Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1916959Z method(*args, **kwargs) 2025-12-04T13:38:32.1917114Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1917154Z method(*args, **kwargs) 2025-12-04T13:38:32.1917308Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1917346Z with policy(): 2025-12-04T13:38:32.1917504Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1917544Z raise RuntimeError(msg) 2025-12-04T13:38:32.1917935Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 1. CUDA driver allocated memory was 2317352960 and is now 17502830592. 2025-12-04T13:38:32.1917937Z 2025-12-04T13:38:32.1918011Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1918290Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.1918303Z 2025-12-04T13:38:32.1918393Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1918457Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
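(The RuntimeError reported above is raised by the CUDA memory-leak check that the repro command enables with PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1: the check in torch/testing/_internal/common_utils.py, whose __exit__ appears in the traceback, snapshots per-device allocator statistics before the test body and compares them afterwards. The Python sketch below only illustrates that before/after comparison; the name check_for_leak and the use of torch.cuda.mem_get_info as a stand-in for the driver-level counter are assumptions for illustration, not the actual leak-check implementation.

import torch

def check_for_leak(test_fn, device: int = 0) -> None:
    # Illustrative sketch only: snapshot caching-allocator and driver-level
    # memory before the test body runs.
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)
    free_before, total = torch.cuda.mem_get_info(device)
    driver_before = total - free_before

    test_fn()

    # Snapshot again afterwards; growth suggests tensors or communication
    # buffers (e.g. process groups that were never destroyed) are still alive.
    torch.cuda.synchronize(device)
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _ = torch.cuda.mem_get_info(device)
    driver_after = total - free_after

    if alloc_after > alloc_before or driver_after > driver_before:
        raise RuntimeError(
            f"possible leak on device {device}: caching allocator "
            f"{alloc_before} -> {alloc_after} bytes, driver "
            f"{driver_before} -> {driver_after} bytes"
        )

In the failure above the same kind of comparison reports the caching allocator growing from 512 to 117248 bytes and driver-allocated memory growing by roughly 15 GB per rank.)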
2025-12-04T13:38:32.1918521Z ====================== 1 failed, 32 deselected in 46.73s ======================= 2025-12-04T13:38:32.1918559Z Got exit code 1 2025-12-04T13:38:32.1918777Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.1918905Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:32.1919098Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-9b67bb4b5b795d1e.xml 2025-12-04T13:38:32.1919156Z ============================= test session starts ============================== 2025-12-04T13:38:32.1919275Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.1919334Z cachedir: .pytest_cache 2025-12-04T13:38:32.1919494Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.1919541Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.1919622Z configfile: pytest.ini 2025-12-04T13:38:32.1919787Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.1919865Z collecting ... collected 60 items / 20 deselected / 40 selected 2025-12-04T13:38:32.1919918Z stepcurrent: skipping 20 already run items. 2025-12-04T13:38:32.1919964Z Running 13 items in this shard 2025-12-04T13:38:32.1919966Z 2025-12-04T13:38:32.1920284Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_no_shard_cuda I1204 13:30:51.378000 420831 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 420900 2025-12-04T13:38:32.1920440Z I1204 13:30:51.379000 420831 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 420901 2025-12-04T13:38:32.1920596Z I1204 13:30:51.379000 420831 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 420902 2025-12-04T13:38:32.1920760Z I1204 13:30:51.380000 420831 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 420903 2025-12-04T13:38:32.1921056Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:485: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1921104Z return wrapper_cls(module, **kwargs) 2025-12-04T13:38:32.1921692Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1921732Z _warn_cpu_init() 2025-12-04T13:38:32.1922016Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:485: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:38:32.1922065Z return wrapper_cls(module, **kwargs) 2025-12-04T13:38:32.1922355Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:485: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1922402Z return wrapper_cls(module, **kwargs) 2025-12-04T13:38:32.1922990Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1923033Z _warn_cpu_init() 2025-12-04T13:38:32.1923605Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1923665Z _warn_cpu_init() 2025-12-04T13:38:32.1923956Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:532: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1924044Z fsdp_model = FSDP(model, auto_wrap_policy=always_wrap_policy, **fsdp_kwargs) 2025-12-04T13:38:32.1924332Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:532: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1924417Z fsdp_model = FSDP(model, auto_wrap_policy=always_wrap_policy, **fsdp_kwargs) 2025-12-04T13:38:32.1924705Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:532: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1924790Z fsdp_model = FSDP(model, auto_wrap_policy=always_wrap_policy, **fsdp_kwargs) 2025-12-04T13:38:32.1925069Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:485: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1925114Z return wrapper_cls(module, **kwargs) 2025-12-04T13:38:32.1925697Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.1925738Z _warn_cpu_init() 2025-12-04T13:38:32.1926028Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:532: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1926116Z fsdp_model = FSDP(model, auto_wrap_policy=always_wrap_policy, **fsdp_kwargs) 2025-12-04T13:38:32.1926347Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1926392Z return func(*args, **kwargs) 2025-12-04T13:38:32.1926618Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1926674Z return func(*args, **kwargs) 2025-12-04T13:38:32.1926896Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1926951Z return func(*args, **kwargs) 2025-12-04T13:38:32.1927175Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1927215Z return func(*args, **kwargs) 2025-12-04T13:38:32.1927440Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1927480Z return func(*args, **kwargs) 2025-12-04T13:38:32.1927702Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1927741Z return func(*args, **kwargs) 2025-12-04T13:38:32.1927960Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1928013Z return func(*args, **kwargs) 2025-12-04T13:38:32.1928234Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1928273Z return func(*args, **kwargs) 2025-12-04T13:38:32.1928568Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.1928609Z return func(*args, **kwargs) 2025-12-04T13:38:32.1929943Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. 
To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1930076Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1931352Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1931475Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1932741Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1932877Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1934135Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. 
To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1934254Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1934400Z [rank3]:E1204 13:30:59.014000 420903 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1934574Z [rank3]:E1204 13:30:59.014000 420903 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1934868Z [rank3]:E1204 13:30:59.014000 420903 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1935026Z [rank3]:E1204 13:30:59.014000 420903 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1935311Z [rank3]:E1204 13:30:59.014000 420903 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1935438Z [rank3]:E1204 13:30:59.014000 420903 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1935718Z [rank3]:E1204 13:30:59.014000 420903 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1935870Z [rank3]:E1204 13:30:59.014000 420903 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1936160Z [rank3]:E1204 13:30:59.014000 420903 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1936319Z [rank3]:E1204 13:30:59.014000 420903 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1936597Z [rank3]:E1204 13:30:59.014000 420903 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1936735Z [rank3]:E1204 13:30:59.014000 420903 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1937016Z [rank3]:E1204 13:30:59.014000 420903 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1937164Z [rank3]:E1204 13:30:59.014000 420903 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1937656Z [rank3]:E1204 13:30:59.014000 420903 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 180736 on device 3. CUDA driver allocated memory was 2250244096 and is now 3808428032. 2025-12-04T13:38:32.1937785Z [rank3]:E1204 13:30:59.014000 420903 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1937982Z [rank3]:E1204 13:30:59.014000 420903 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1938358Z [rank3]:E1204 13:30:59.014000 420903 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda 2025-12-04T13:38:32.1938472Z [rank3]:E1204 13:30:59.014000 420903 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1938686Z [rank3]:E1204 13:30:59.014000 420903 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1938861Z [rank3]:E1204 13:30:59.014000 420903 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.1938903Z dist init r=3, world=4 2025-12-04T13:38:32.1939041Z [rank2]:E1204 13:30:59.015000 420902 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1939205Z [rank2]:E1204 13:30:59.015000 420902 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1939493Z [rank2]:E1204 13:30:59.015000 420902 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1939686Z [rank2]:E1204 13:30:59.015000 420902 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1939972Z [rank2]:E1204 13:30:59.015000 420902 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1940095Z [rank2]:E1204 13:30:59.015000 420902 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1940385Z [rank2]:E1204 13:30:59.015000 420902 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1940553Z [rank2]:E1204 13:30:59.015000 420902 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1940833Z [rank2]:E1204 13:30:59.015000 420902 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1940982Z [rank2]:E1204 13:30:59.015000 420902 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1941256Z [rank2]:E1204 13:30:59.015000 420902 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1941394Z [rank2]:E1204 13:30:59.015000 420902 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1941671Z [rank2]:E1204 13:30:59.015000 420902 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1941834Z [rank2]:E1204 13:30:59.015000 420902 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1942320Z [rank2]:E1204 13:30:59.015000 420902 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 184832 on device 2. CUDA driver allocated memory was 2300575744 and is now 3858759680. 2025-12-04T13:38:32.1942437Z [rank2]:E1204 13:30:59.015000 420902 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1942636Z [rank2]:E1204 13:30:59.015000 420902 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1943008Z [rank2]:E1204 13:30:59.015000 420902 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda 2025-12-04T13:38:32.1943138Z [rank2]:E1204 13:30:59.015000 420902 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1943349Z [rank2]:E1204 13:30:59.015000 420902 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1943517Z [rank2]:E1204 13:30:59.015000 420902 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.1943555Z dist init r=2, world=4 2025-12-04T13:38:32.1943693Z [rank0]:E1204 13:30:59.082000 420900 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1943855Z [rank0]:E1204 13:30:59.082000 420900 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1944144Z [rank0]:E1204 13:30:59.082000 420900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1944302Z [rank0]:E1204 13:30:59.082000 420900 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1944596Z [rank0]:E1204 13:30:59.082000 420900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1944722Z [rank0]:E1204 13:30:59.082000 420900 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1945009Z [rank0]:E1204 13:30:59.082000 420900 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1945158Z [rank0]:E1204 13:30:59.082000 420900 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1945435Z [rank0]:E1204 13:30:59.082000 420900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1945585Z [rank0]:E1204 13:30:59.082000 420900 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1945863Z [rank0]:E1204 13:30:59.082000 420900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1946009Z [rank0]:E1204 13:30:59.082000 420900 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1946290Z [rank0]:E1204 13:30:59.082000 420900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1946443Z [rank0]:E1204 13:30:59.082000 420900 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1946931Z [rank0]:E1204 13:30:59.082000 420900 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 186880 on device 0. CUDA driver allocated memory was 2453667840 and is now 4011851776. 
2025-12-04T13:38:32.1947051Z [rank0]:E1204 13:30:59.082000 420900 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1947246Z [rank0]:E1204 13:30:59.082000 420900 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1947629Z [rank0]:E1204 13:30:59.082000 420900 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda 2025-12-04T13:38:32.1947748Z [rank0]:E1204 13:30:59.082000 420900 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1947959Z [rank0]:E1204 13:30:59.082000 420900 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1948130Z [rank0]:E1204 13:30:59.082000 420900 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.1948169Z dist init r=0, world=4 2025-12-04T13:38:32.1948309Z [rank1]:E1204 13:30:59.096000 420901 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1948470Z [rank1]:E1204 13:30:59.096000 420901 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1948769Z [rank1]:E1204 13:30:59.096000 420901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1948923Z [rank1]:E1204 13:30:59.096000 420901 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1949222Z [rank1]:E1204 13:30:59.096000 420901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1949346Z [rank1]:E1204 13:30:59.096000 420901 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1949659Z [rank1]:E1204 13:30:59.096000 420901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1949810Z [rank1]:E1204 13:30:59.096000 420901 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1950088Z [rank1]:E1204 13:30:59.096000 420901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1950255Z [rank1]:E1204 13:30:59.096000 420901 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1950531Z [rank1]:E1204 13:30:59.096000 420901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1950672Z [rank1]:E1204 13:30:59.096000 420901 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.1950950Z [rank1]:E1204 13:30:59.096000 420901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1951103Z [rank1]:E1204 13:30:59.096000 420901 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1951591Z [rank1]:E1204 13:30:59.096000 420901 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 188928 on device 1. CUDA driver allocated memory was 2317352960 and is now 3875536896. 2025-12-04T13:38:32.1951720Z [rank1]:E1204 13:30:59.096000 420901 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1951917Z [rank1]:E1204 13:30:59.096000 420901 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1952286Z [rank1]:E1204 13:30:59.096000 420901 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda 2025-12-04T13:38:32.1952402Z [rank1]:E1204 13:30:59.096000 420901 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1952615Z [rank1]:E1204 13:30:59.096000 420901 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1952780Z [rank1]:E1204 13:30:59.096000 420901 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.1952822Z dist init r=1, world=4 2025-12-04T13:38:32.1953170Z [rank0]:[W1204 13:30:59.349308445 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1953214Z FAILED [9.5190s] [ 7%] 2025-12-04T13:38:32.1953216Z 2025-12-04T13:38:32.1953287Z =================================== FAILURES =================================== 2025-12-04T13:38:32.1953399Z _ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda _ 2025-12-04T13:38:32.1953446Z Traceback (most recent call last): 2025-12-04T13:38:32.1953614Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.1953658Z self._join_processes(fn) 2025-12-04T13:38:32.1953836Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.1953891Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.1954074Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.1954119Z raise RuntimeError(error) 2025-12-04T13:38:32.1954203Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.1954252Z Traceback (most recent call last): 2025-12-04T13:38:32.1954429Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1954474Z getattr(self, test_name)() 2025-12-04T13:38:32.1954634Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1954674Z fn() 2025-12-04T13:38:32.1954828Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1954874Z method(*args, **kwargs) 2025-12-04T13:38:32.1955027Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1955071Z method(*args, **kwargs) 2025-12-04T13:38:32.1955222Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1955266Z with policy(): 2025-12-04T13:38:32.1955420Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1955465Z raise RuntimeError(msg) 2025-12-04T13:38:32.1955836Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 180736 on device 3. CUDA driver allocated memory was 2250244096 and is now 3808428032. 
2025-12-04T13:38:32.1955839Z 2025-12-04T13:38:32.1955918Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1956160Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda 2025-12-04T13:38:32.1956162Z 2025-12-04T13:38:32.1956253Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1956256Z 2025-12-04T13:38:32.1956258Z 2025-12-04T13:38:32.1956338Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.1956426Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.1956665Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-9b67bb4b5b795d1e.xml - 2025-12-04T13:38:32.1956727Z =========================== short test summary info ============================ 2025-12-04T13:38:32.1957002Z FAILED [9.5190s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_no_shard_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.1957049Z Traceback (most recent call last): 2025-12-04T13:38:32.1957219Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1957273Z getattr(self, test_name)() 2025-12-04T13:38:32.1957437Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1957474Z fn() 2025-12-04T13:38:32.1957632Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1957673Z method(*args, **kwargs) 2025-12-04T13:38:32.1957828Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1957868Z method(*args, **kwargs) 2025-12-04T13:38:32.1958022Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1958060Z with policy(): 2025-12-04T13:38:32.1958216Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1958272Z raise RuntimeError(msg) 2025-12-04T13:38:32.1958635Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 180736 on device 3. CUDA driver allocated memory was 2250244096 and is now 3808428032. 2025-12-04T13:38:32.1958637Z 2025-12-04T13:38:32.1958717Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1958959Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda 2025-12-04T13:38:32.1958962Z 2025-12-04T13:38:32.1959054Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1959118Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
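The repro block above also shows how this failure is detected: with PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 the harness snapshots caching-allocator and driver-level memory before the test body and compares the counters afterwards, raising when either one grows. The following is only an illustrative sketch of that comparison using public torch.cuda APIs, not the actual CudaMemoryLeakCheck code in common_utils.py; run_test_body is a hypothetical callable standing in for the test.

    import torch

    def check_for_leak(run_test_body, device=0):
        # Illustrative leak check: snapshot allocator and driver memory, run the test,
        # then compare, mirroring the numbers quoted in the RuntimeError above.
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_before = torch.cuda.memory_allocated(device)
        free_before, total = torch.cuda.mem_get_info(device)

        run_test_body()

        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)

        if alloc_after > alloc_before or free_after < free_before:
            raise RuntimeError(
                f"possible leak on device {device}: allocator {alloc_before} -> {alloc_after} bytes, "
                f"driver-allocated {total - free_before} -> {total - free_after} bytes"
            )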
2025-12-04T13:38:32.1959186Z ======================= 1 failed, 20 deselected in 9.68s ======================= 2025-12-04T13:38:32.1959226Z Got exit code 1 2025-12-04T13:38:32.1959272Z Retrying single test... 2025-12-04T13:38:32.1959463Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-09076c94ea287f91.xml 2025-12-04T13:38:32.1959540Z ============================= test session starts ============================== 2025-12-04T13:38:32.1959686Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.1959731Z cachedir: .pytest_cache 2025-12-04T13:38:32.1959892Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.1959943Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.1959984Z configfile: pytest.ini 2025-12-04T13:38:32.1960151Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.1960230Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.1960464Z stepcurrent: skipping 20 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_no_shard_cuda 2025-12-04T13:38:32.1960510Z Running 1 items in this shard 2025-12-04T13:38:32.1960513Z 2025-12-04T13:38:32.1960827Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_no_shard_cuda I1204 13:31:03.493000 421233 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 421302 2025-12-04T13:38:32.1961000Z I1204 13:31:03.494000 421233 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 421303 2025-12-04T13:38:32.1961155Z I1204 13:31:03.495000 421233 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 421304 2025-12-04T13:38:32.1961325Z I1204 13:31:03.495000 421233 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 421305 2025-12-04T13:38:32.1961614Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:485: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1961668Z return wrapper_cls(module, **kwargs) 2025-12-04T13:38:32.1962248Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1962302Z _warn_cpu_init() 2025-12-04T13:38:32.1962589Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:485: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:38:32.1962635Z return wrapper_cls(module, **kwargs) 2025-12-04T13:38:32.1962917Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:485: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1962962Z return wrapper_cls(module, **kwargs) 2025-12-04T13:38:32.1963536Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1963579Z _warn_cpu_init() 2025-12-04T13:38:32.1964174Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.1964215Z _warn_cpu_init() 2025-12-04T13:38:32.1964505Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:532: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1964599Z fsdp_model = FSDP(model, auto_wrap_policy=always_wrap_policy, **fsdp_kwargs) 2025-12-04T13:38:32.1964884Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:532: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1964973Z fsdp_model = FSDP(model, auto_wrap_policy=always_wrap_policy, **fsdp_kwargs) 2025-12-04T13:38:32.1965260Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:532: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1965358Z fsdp_model = FSDP(model, auto_wrap_policy=always_wrap_policy, **fsdp_kwargs) 2025-12-04T13:38:32.1965640Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:485: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1965697Z return wrapper_cls(module, **kwargs) 2025-12-04T13:38:32.1966276Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.1966313Z _warn_cpu_init() 2025-12-04T13:38:32.1966606Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:532: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.1966694Z fsdp_model = FSDP(model, auto_wrap_policy=always_wrap_policy, **fsdp_kwargs) 2025-12-04T13:38:32.1966936Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1966983Z return func(*args, **kwargs) 2025-12-04T13:38:32.1967208Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1967255Z return func(*args, **kwargs) 2025-12-04T13:38:32.1967478Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1967523Z return func(*args, **kwargs) 2025-12-04T13:38:32.1967745Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1967791Z return func(*args, **kwargs) 2025-12-04T13:38:32.1968012Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1968056Z return func(*args, **kwargs) 2025-12-04T13:38:32.1968289Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1968333Z return func(*args, **kwargs) 2025-12-04T13:38:32.1968556Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1968600Z return func(*args, **kwargs) 2025-12-04T13:38:32.1968823Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.1968866Z return func(*args, **kwargs) 2025-12-04T13:38:32.1969162Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.1969203Z return func(*args, **kwargs) 2025-12-04T13:38:32.1970540Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node.
To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1970683Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1971943Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1972086Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1973368Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1973493Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1974757Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. 
To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.1974882Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.1975048Z [rank3]:E1204 13:31:11.015000 421305 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1975212Z [rank3]:E1204 13:31:11.015000 421305 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1975510Z [rank3]:E1204 13:31:11.015000 421305 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1975667Z [rank3]:E1204 13:31:11.015000 421305 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1975958Z [rank3]:E1204 13:31:11.015000 421305 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1976086Z [rank3]:E1204 13:31:11.015000 421305 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1976382Z [rank3]:E1204 13:31:11.015000 421305 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1976539Z [rank3]:E1204 13:31:11.015000 421305 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1976817Z [rank3]:E1204 13:31:11.015000 421305 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1976969Z [rank3]:E1204 13:31:11.015000 421305 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1977252Z [rank3]:E1204 13:31:11.015000 421305 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1977395Z [rank3]:E1204 13:31:11.015000 421305 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1977691Z [rank3]:E1204 13:31:11.015000 421305 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1977844Z [rank3]:E1204 13:31:11.015000 421305 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1978341Z [rank3]:E1204 13:31:11.015000 421305 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 188928 on device 3. CUDA driver allocated memory was 2250244096 and is now 3808428032. 2025-12-04T13:38:32.1978458Z [rank3]:E1204 13:31:11.015000 421305 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1978658Z [rank3]:E1204 13:31:11.015000 421305 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1979028Z [rank3]:E1204 13:31:11.015000 421305 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda 2025-12-04T13:38:32.1979159Z [rank3]:E1204 13:31:11.015000 421305 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1979380Z [rank3]:E1204 13:31:11.015000 421305 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1979557Z [rank3]:E1204 13:31:11.015000 421305 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.1979670Z dist init r=3, world=4 2025-12-04T13:38:32.1979809Z [rank0]:E1204 13:31:11.027000 421302 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1979973Z [rank0]:E1204 13:31:11.027000 421302 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1980264Z [rank0]:E1204 13:31:11.027000 421302 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1980422Z [rank0]:E1204 13:31:11.027000 421302 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1980726Z [rank0]:E1204 13:31:11.027000 421302 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1980853Z [rank0]:E1204 13:31:11.027000 421302 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1981135Z [rank0]:E1204 13:31:11.027000 421302 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1981283Z [rank0]:E1204 13:31:11.027000 421302 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1981565Z [rank0]:E1204 13:31:11.027000 421302 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1981714Z [rank0]:E1204 13:31:11.027000 421302 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1981993Z [rank0]:E1204 13:31:11.027000 421302 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1982147Z [rank0]:E1204 13:31:11.027000 421302 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1982430Z [rank0]:E1204 13:31:11.027000 421302 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1982582Z [rank0]:E1204 13:31:11.027000 421302 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1983075Z [rank0]:E1204 13:31:11.027000 421302 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 188928 on device 0. CUDA driver allocated memory was 2453667840 and is now 4011851776. 2025-12-04T13:38:32.1983194Z [rank0]:E1204 13:31:11.027000 421302 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1983391Z [rank0]:E1204 13:31:11.027000 421302 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1983778Z [rank0]:E1204 13:31:11.027000 421302 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda 2025-12-04T13:38:32.1983909Z [rank0]:E1204 13:31:11.027000 421302 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1984125Z [rank0]:E1204 13:31:11.027000 421302 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1984295Z [rank0]:E1204 13:31:11.027000 421302 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.1984334Z dist init r=0, world=4 2025-12-04T13:38:32.1984476Z [rank2]:E1204 13:31:11.071000 421304 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1984638Z [rank2]:E1204 13:31:11.071000 421304 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1984930Z [rank2]:E1204 13:31:11.071000 421304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1985115Z [rank2]:E1204 13:31:11.071000 421304 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1985411Z [rank2]:E1204 13:31:11.071000 421304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1985535Z [rank2]:E1204 13:31:11.071000 421304 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1985818Z [rank2]:E1204 13:31:11.071000 421304 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1985973Z [rank2]:E1204 13:31:11.071000 421304 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1986252Z [rank2]:E1204 13:31:11.071000 421304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1986417Z [rank2]:E1204 13:31:11.071000 421304 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1986697Z [rank2]:E1204 13:31:11.071000 421304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1986839Z [rank2]:E1204 13:31:11.071000 421304 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.1987121Z [rank2]:E1204 13:31:11.071000 421304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1987273Z [rank2]:E1204 13:31:11.071000 421304 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1987771Z [rank2]:E1204 13:31:11.071000 421304 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 184832 on device 2. CUDA driver allocated memory was 2300575744 and is now 3858759680. 
2025-12-04T13:38:32.1987895Z [rank2]:E1204 13:31:11.071000 421304 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1988094Z [rank2]:E1204 13:31:11.071000 421304 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1988473Z [rank2]:E1204 13:31:11.071000 421304 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda 2025-12-04T13:38:32.1988594Z [rank2]:E1204 13:31:11.071000 421304 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1988810Z [rank2]:E1204 13:31:11.071000 421304 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1988983Z [rank2]:E1204 13:31:11.071000 421304 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.1989027Z dist init r=2, world=4 2025-12-04T13:38:32.1989170Z [rank1]:E1204 13:31:11.073000 421303 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.1989360Z [rank1]:E1204 13:31:11.073000 421303 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.1989687Z [rank1]:E1204 13:31:11.073000 421303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1989851Z [rank1]:E1204 13:31:11.073000 421303 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.1990150Z [rank1]:E1204 13:31:11.073000 421303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1990279Z [rank1]:E1204 13:31:11.073000 421303 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.1990565Z [rank1]:E1204 13:31:11.073000 421303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1990720Z [rank1]:E1204 13:31:11.073000 421303 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1991023Z [rank1]:E1204 13:31:11.073000 421303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1991176Z [rank1]:E1204 13:31:11.073000 421303 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.1991462Z [rank1]:E1204 13:31:11.073000 421303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1991602Z [rank1]:E1204 13:31:11.073000 421303 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.1991891Z [rank1]:E1204 13:31:11.073000 421303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1992042Z [rank1]:E1204 13:31:11.073000 421303 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.1992562Z [rank1]:E1204 13:31:11.073000 421303 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 188928 on device 1. CUDA driver allocated memory was 2317352960 and is now 3875536896. 2025-12-04T13:38:32.1992697Z [rank1]:E1204 13:31:11.073000 421303 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1992897Z [rank1]:E1204 13:31:11.073000 421303 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1993276Z [rank1]:E1204 13:31:11.073000 421303 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda 2025-12-04T13:38:32.1993393Z [rank1]:E1204 13:31:11.073000 421303 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.1993612Z [rank1]:E1204 13:31:11.073000 421303 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1993794Z [rank1]:E1204 13:31:11.073000 421303 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.1993837Z dist init r=1, world=4 2025-12-04T13:38:32.1994184Z [rank0]:[W1204 13:31:11.228304080 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.1994225Z FAILED [9.4192s] [100%] 2025-12-04T13:38:32.1994227Z 2025-12-04T13:38:32.1994289Z =================================== FAILURES =================================== 2025-12-04T13:38:32.1994400Z _ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda _ 2025-12-04T13:38:32.1994452Z Traceback (most recent call last): 2025-12-04T13:38:32.1994623Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.1994673Z self._join_processes(fn) 2025-12-04T13:38:32.1994854Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.1994914Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.1995108Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.1995159Z raise RuntimeError(error) 2025-12-04T13:38:32.1995242Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.1995292Z Traceback (most recent call last): 2025-12-04T13:38:32.1995459Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1995507Z getattr(self, test_name)() 2025-12-04T13:38:32.1995672Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1995713Z fn() 2025-12-04T13:38:32.1995870Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1995915Z method(*args, **kwargs) 2025-12-04T13:38:32.1996073Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1996118Z method(*args, **kwargs) 2025-12-04T13:38:32.1996277Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1996316Z with policy(): 2025-12-04T13:38:32.1996489Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1996533Z raise RuntimeError(msg) 2025-12-04T13:38:32.1996918Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 188928 on device 3. CUDA driver allocated memory was 2250244096 and is now 3808428032. 
2025-12-04T13:38:32.1996922Z 2025-12-04T13:38:32.1996999Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.1997251Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda 2025-12-04T13:38:32.1997254Z 2025-12-04T13:38:32.1997344Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.1997347Z 2025-12-04T13:38:32.1997349Z 2025-12-04T13:38:32.1997430Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.1997523Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.1997776Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-09076c94ea287f91.xml - 2025-12-04T13:38:32.1997841Z =========================== short test summary info ============================ 2025-12-04T13:38:32.1998103Z FAILED [9.4192s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_no_shard_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.1998155Z Traceback (most recent call last): 2025-12-04T13:38:32.1998324Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.1998374Z getattr(self, test_name)() 2025-12-04T13:38:32.1998541Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.1998582Z fn() 2025-12-04T13:38:32.1998745Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1998790Z method(*args, **kwargs) 2025-12-04T13:38:32.1998952Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.1998997Z method(*args, **kwargs) 2025-12-04T13:38:32.1999168Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.1999213Z with policy(): 2025-12-04T13:38:32.1999374Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.1999422Z raise RuntimeError(msg) 2025-12-04T13:38:32.1999880Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 188928 on device 3. CUDA driver allocated memory was 2250244096 and is now 3808428032. 2025-12-04T13:38:32.1999887Z 2025-12-04T13:38:32.1999966Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2000221Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda 2025-12-04T13:38:32.2000224Z 2025-12-04T13:38:32.2000316Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2000387Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
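Both the original run and this retry also end with the ProcessGroupNCCL warning that destroy_process_group() was not called before program exit. A minimal sketch of the teardown that warning asks for, assuming a conventional torch.distributed setup (the surrounding harness is not shown in this log):

    import torch.distributed as dist

    def shutdown_distributed():
        # Tear down the default process group before the process exits,
        # releasing NCCL/RCCL communicator resources as the warning suggests.
        if dist.is_available() and dist.is_initialized():
            dist.destroy_process_group()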
2025-12-04T13:38:32.2000469Z ======================= 1 failed, 32 deselected in 9.58s ======================= 2025-12-04T13:38:32.2000515Z Got exit code 1 2025-12-04T13:38:32.2000559Z Retrying single test... 2025-12-04T13:38:32.2000763Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-b807e692337b1cb2.xml 2025-12-04T13:38:32.2000842Z ============================= test session starts ============================== 2025-12-04T13:38:32.2000965Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.2001010Z cachedir: .pytest_cache 2025-12-04T13:38:32.2001182Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.2001232Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.2001279Z configfile: pytest.ini 2025-12-04T13:38:32.2001455Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.2001537Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.2001783Z stepcurrent: skipping 20 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_no_shard_cuda 2025-12-04T13:38:32.2001850Z Running 1 items in this shard 2025-12-04T13:38:32.2001853Z 2025-12-04T13:38:32.2002189Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_no_shard_cuda I1204 13:31:15.477000 421635 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 421704 2025-12-04T13:38:32.2002355Z I1204 13:31:15.478000 421635 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 421705 2025-12-04T13:38:32.2002519Z I1204 13:31:15.478000 421635 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 421706 2025-12-04T13:38:32.2002681Z I1204 13:31:15.479000 421635 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 421707 2025-12-04T13:38:32.2002988Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:485: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2003040Z return wrapper_cls(module, **kwargs) 2025-12-04T13:38:32.2003670Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2003714Z _warn_cpu_init() 2025-12-04T13:38:32.2004019Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:532: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:38:32.2004117Z fsdp_model = FSDP(model, auto_wrap_policy=always_wrap_policy, **fsdp_kwargs) 2025-12-04T13:38:32.2004414Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:485: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2004466Z return wrapper_cls(module, **kwargs) 2025-12-04T13:38:32.2004762Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:485: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2004814Z return wrapper_cls(module, **kwargs) 2025-12-04T13:38:32.2005122Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/wrap.py:485: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2005190Z return wrapper_cls(module, **kwargs) 2025-12-04T13:38:32.2005809Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2005849Z _warn_cpu_init() 2025-12-04T13:38:32.2006453Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2006505Z _warn_cpu_init() 2025-12-04T13:38:32.2007108Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2007152Z _warn_cpu_init() 2025-12-04T13:38:32.2007460Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:532: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2007556Z fsdp_model = FSDP(model, auto_wrap_policy=always_wrap_policy, **fsdp_kwargs) 2025-12-04T13:38:32.2007859Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:532: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:38:32.2007953Z fsdp_model = FSDP(model, auto_wrap_policy=always_wrap_policy, **fsdp_kwargs) 2025-12-04T13:38:32.2008265Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:532: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2008358Z fsdp_model = FSDP(model, auto_wrap_policy=always_wrap_policy, **fsdp_kwargs) 2025-12-04T13:38:32.2008603Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.2008653Z return func(*args, **kwargs) 2025-12-04T13:38:32.2008900Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.2008947Z return func(*args, **kwargs) 2025-12-04T13:38:32.2009192Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.2009238Z return func(*args, **kwargs) 2025-12-04T13:38:32.2009480Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.2009525Z return func(*args, **kwargs) 2025-12-04T13:38:32.2009843Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.2009902Z return func(*args, **kwargs) 2025-12-04T13:38:32.2010145Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.2010189Z return func(*args, **kwargs) 2025-12-04T13:38:32.2010430Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.2010474Z return func(*args, **kwargs) 2025-12-04T13:38:32.2010716Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.2010761Z return func(*args, **kwargs) 2025-12-04T13:38:32.2011081Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.2011145Z return func(*args, **kwargs) 2025-12-04T13:38:32.2012529Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. 
This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.2012671Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.2014049Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.2014186Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.2015554Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.2015702Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.2017066Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. 
This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:38:32.2017210Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:38:32.2017367Z [rank1]:E1204 13:31:23.001000 421705 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2017547Z [rank1]:E1204 13:31:23.001000 421705 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2017866Z [rank1]:E1204 13:31:23.001000 421705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2018038Z [rank1]:E1204 13:31:23.001000 421705 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2018360Z [rank1]:E1204 13:31:23.001000 421705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2018499Z [rank1]:E1204 13:31:23.001000 421705 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2018802Z [rank1]:E1204 13:31:23.001000 421705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2018969Z [rank1]:E1204 13:31:23.001000 421705 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2019273Z [rank1]:E1204 13:31:23.001000 421705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2019435Z [rank1]:E1204 13:31:23.001000 421705 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2019796Z [rank1]:E1204 13:31:23.001000 421705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2019945Z [rank1]:E1204 13:31:23.001000 421705 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2020272Z [rank1]:E1204 13:31:23.001000 421705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2020438Z [rank1]:E1204 13:31:23.001000 421705 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2020979Z [rank1]:E1204 13:31:23.001000 421705 
site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 176640 on device 1. CUDA driver allocated memory was 2317352960 and is now 3875536896. 2025-12-04T13:38:32.2021108Z [rank1]:E1204 13:31:23.001000 421705 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2021322Z [rank1]:E1204 13:31:23.001000 421705 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2021744Z [rank1]:E1204 13:31:23.001000 421705 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda 2025-12-04T13:38:32.2021868Z [rank1]:E1204 13:31:23.001000 421705 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2022104Z [rank1]:E1204 13:31:23.001000 421705 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2022288Z [rank1]:E1204 13:31:23.001000 421705 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2022332Z dist init r=1, world=4 2025-12-04T13:38:32.2022485Z [rank0]:E1204 13:31:23.014000 421704 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2022660Z [rank0]:E1204 13:31:23.014000 421704 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2022994Z [rank0]:E1204 13:31:23.014000 421704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2023162Z [rank0]:E1204 13:31:23.014000 421704 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2023481Z [rank0]:E1204 13:31:23.014000 421704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2023615Z [rank0]:E1204 13:31:23.014000 421704 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2023921Z [rank0]:E1204 13:31:23.014000 421704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2024086Z [rank0]:E1204 13:31:23.014000 421704 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2024386Z [rank0]:E1204 13:31:23.014000 421704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2024567Z [rank0]:E1204 13:31:23.014000 421704 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2024868Z [rank0]:E1204 
13:31:23.014000 421704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2025034Z [rank0]:E1204 13:31:23.014000 421704 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2025339Z [rank0]:E1204 13:31:23.014000 421704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2025504Z [rank0]:E1204 13:31:23.014000 421704 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2026038Z [rank0]:E1204 13:31:23.014000 421704 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 172544 on device 0. CUDA driver allocated memory was 2453667840 and is now 4011851776. 2025-12-04T13:38:32.2026175Z [rank0]:E1204 13:31:23.014000 421704 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2026392Z [rank0]:E1204 13:31:23.014000 421704 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2026793Z [rank0]:E1204 13:31:23.014000 421704 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda 2025-12-04T13:38:32.2026919Z [rank0]:E1204 13:31:23.014000 421704 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2027149Z [rank0]:E1204 13:31:23.014000 421704 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2027335Z [rank0]:E1204 13:31:23.014000 421704 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2027378Z dist init r=0, world=4 2025-12-04T13:38:32.2027530Z [rank3]:E1204 13:31:23.058000 421707 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2027717Z [rank3]:E1204 13:31:23.058000 421707 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2028033Z [rank3]:E1204 13:31:23.058000 421707 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2028204Z [rank3]:E1204 13:31:23.058000 421707 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2028514Z [rank3]:E1204 13:31:23.058000 421707 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2028650Z [rank3]:E1204 13:31:23.058000 421707 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2028951Z 
[rank3]:E1204 13:31:23.058000 421707 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2029125Z [rank3]:E1204 13:31:23.058000 421707 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2029428Z [rank3]:E1204 13:31:23.058000 421707 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2029619Z [rank3]:E1204 13:31:23.058000 421707 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2029922Z [rank3]:E1204 13:31:23.058000 421707 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2030069Z [rank3]:E1204 13:31:23.058000 421707 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2030374Z [rank3]:E1204 13:31:23.058000 421707 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2030535Z [rank3]:E1204 13:31:23.058000 421707 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2031081Z [rank3]:E1204 13:31:23.058000 421707 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 172544 on device 3. CUDA driver allocated memory was 2250244096 and is now 3808428032. 
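The `NO_SHARD` FutureWarning repeated above already names the suggested replacement: plain DistributedDataParallel. A minimal sketch of that substitution, assuming the default process group is already initialized and `rank` is the local GPU index; the helper name and toy model handling are illustrative, not taken from the test:

import torch
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def wrap_without_sharding(model: nn.Module, rank: int) -> DDP:
    # Rough equivalent of FSDP's deprecated NO_SHARD strategy: every rank keeps
    # the full parameters and gradients are all-reduced instead of sharded.
    model = model.to(torch.device("cuda", rank))
    return DDP(model, device_ids=[rank])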
2025-12-04T13:38:32.2031207Z [rank3]:E1204 13:31:23.058000 421707 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2031419Z [rank3]:E1204 13:31:23.058000 421707 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2031815Z [rank3]:E1204 13:31:23.058000 421707 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda 2025-12-04T13:38:32.2031939Z [rank3]:E1204 13:31:23.058000 421707 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2032172Z [rank3]:E1204 13:31:23.058000 421707 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2032366Z [rank3]:E1204 13:31:23.058000 421707 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2032411Z dist init r=3, world=4 2025-12-04T13:38:32.2032560Z [rank2]:E1204 13:31:23.078000 421706 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2032737Z [rank2]:E1204 13:31:23.078000 421706 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2033052Z [rank2]:E1204 13:31:23.078000 421706 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2033220Z [rank2]:E1204 13:31:23.078000 421706 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2033531Z [rank2]:E1204 13:31:23.078000 421706 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2033663Z [rank2]:E1204 13:31:23.078000 421706 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2033979Z [rank2]:E1204 13:31:23.078000 421706 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2034154Z [rank2]:E1204 13:31:23.078000 421706 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2034454Z [rank2]:E1204 13:31:23.078000 421706 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2034617Z [rank2]:E1204 13:31:23.078000 421706 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2034919Z [rank2]:E1204 13:31:23.078000 421706 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2035069Z [rank2]:E1204 13:31:23.078000 421706 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.2035370Z [rank2]:E1204 13:31:23.078000 421706 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2035545Z [rank2]:E1204 13:31:23.078000 421706 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2036068Z [rank2]:E1204 13:31:23.078000 421706 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 188928 on device 2. CUDA driver allocated memory was 2300575744 and is now 3858759680. 2025-12-04T13:38:32.2036195Z [rank2]:E1204 13:31:23.078000 421706 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2036409Z [rank2]:E1204 13:31:23.078000 421706 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2036810Z [rank2]:E1204 13:31:23.078000 421706 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda 2025-12-04T13:38:32.2036947Z [rank2]:E1204 13:31:23.078000 421706 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2037175Z [rank2]:E1204 13:31:23.078000 421706 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2037357Z [rank2]:E1204 13:31:23.078000 421706 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2037399Z dist init r=2, world=4 2025-12-04T13:38:32.2037767Z [rank0]:[W1204 13:31:23.200657596 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.2037814Z FAILED [9.4200s] [100%] 2025-12-04T13:38:32.2037816Z 2025-12-04T13:38:32.2037878Z =================================== FAILURES =================================== 2025-12-04T13:38:32.2037997Z _ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda _ 2025-12-04T13:38:32.2038047Z Traceback (most recent call last): 2025-12-04T13:38:32.2038228Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.2038287Z self._join_processes(fn) 2025-12-04T13:38:32.2038479Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.2038552Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.2038748Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.2038797Z raise RuntimeError(error) 2025-12-04T13:38:32.2038886Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.2038935Z Traceback (most recent call last): 2025-12-04T13:38:32.2039116Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2039162Z getattr(self, test_name)() 2025-12-04T13:38:32.2039338Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2039376Z fn() 2025-12-04T13:38:32.2039547Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2039647Z method(*args, **kwargs) 2025-12-04T13:38:32.2039817Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2039862Z method(*args, **kwargs) 2025-12-04T13:38:32.2040031Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2040071Z with policy(): 2025-12-04T13:38:32.2040245Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2040292Z raise RuntimeError(msg) 2025-12-04T13:38:32.2040687Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 176640 on device 1. CUDA driver allocated memory was 2317352960 and is now 3875536896. 
2025-12-04T13:38:32.2040691Z 2025-12-04T13:38:32.2040777Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2041040Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda 2025-12-04T13:38:32.2041043Z 2025-12-04T13:38:32.2041142Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2041166Z 2025-12-04T13:38:32.2041168Z 2025-12-04T13:38:32.2041252Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.2041350Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.2041604Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-b807e692337b1cb2.xml - 2025-12-04T13:38:32.2041675Z =========================== short test summary info ============================ 2025-12-04T13:38:32.2041957Z FAILED [9.4200s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_no_shard_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.2042007Z Traceback (most recent call last): 2025-12-04T13:38:32.2042186Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2042234Z getattr(self, test_name)() 2025-12-04T13:38:32.2042411Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2042449Z fn() 2025-12-04T13:38:32.2042633Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2042677Z method(*args, **kwargs) 2025-12-04T13:38:32.2042845Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2042903Z method(*args, **kwargs) 2025-12-04T13:38:32.2043072Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2043112Z with policy(): 2025-12-04T13:38:32.2043281Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2043328Z raise RuntimeError(msg) 2025-12-04T13:38:32.2043727Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 176640 on device 1. CUDA driver allocated memory was 2317352960 and is now 3875536896. 2025-12-04T13:38:32.2043729Z 2025-12-04T13:38:32.2043810Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2044075Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_no_shard_cuda 2025-12-04T13:38:32.2044091Z 2025-12-04T13:38:32.2044188Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2044257Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
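Two of the warnings interleaved with this failure concern process-group hygiene rather than the leak itself: `barrier(): using the device under current context` (addressed by passing `device_id` to `init_process_group`) and `destroy_process_group() was not called before program exit`. A minimal sketch of a worker that handles both, assuming a torchrun-style launch where LOCAL_RANK is set in the environment; the function name and placeholder body are illustrative:

import os
import torch
import torch.distributed as dist

def main() -> None:
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    # Binding the process group to a device avoids the barrier() UserWarning.
    dist.init_process_group(backend="nccl", device_id=torch.device("cuda", local_rank))
    try:
        dist.barrier()
        # ... test or training body ...
    finally:
        # Explicit teardown avoids the ProcessGroupNCCL resource-leak warning.
        dist.destroy_process_group()

if __name__ == "__main__":
    main()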
2025-12-04T13:38:32.2044329Z ======================= 1 failed, 32 deselected in 9.58s ======================= 2025-12-04T13:38:32.2044370Z Got exit code 1 2025-12-04T13:38:32.2044577Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_no_shard_cuda 2025-12-04T13:38:32.2044716Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:32.2044923Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-65f9ef345ca7c3f2.xml 2025-12-04T13:38:32.2044987Z ============================= test session starts ============================== 2025-12-04T13:38:32.2045116Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.2045161Z cachedir: .pytest_cache 2025-12-04T13:38:32.2045337Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.2045399Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.2045448Z configfile: pytest.ini 2025-12-04T13:38:32.2045625Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.2045709Z collecting ... collected 60 items / 21 deselected / 39 selected 2025-12-04T13:38:32.2045767Z stepcurrent: skipping 21 already run items. 2025-12-04T13:38:32.2045817Z Running 12 items in this shard 2025-12-04T13:38:32.2045820Z 2025-12-04T13:38:32.2046173Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda I1204 13:31:27.411000 422037 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 422106 2025-12-04T13:38:32.2046341Z I1204 13:31:27.411000 422037 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 422107 2025-12-04T13:38:32.2046513Z I1204 13:31:27.412000 422037 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 422108 2025-12-04T13:38:32.2046676Z I1204 13:31:27.412000 422037 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 422109 2025-12-04T13:38:32.2047317Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2047370Z _warn_cpu_init() 2025-12-04T13:38:32.2047995Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.2048039Z _warn_cpu_init() 2025-12-04T13:38:32.2048651Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2048708Z _warn_cpu_init() 2025-12-04T13:38:32.2049324Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2049369Z _warn_cpu_init() 2025-12-04T13:38:32.2049736Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.2049784Z return func(*args, **kwargs) 2025-12-04T13:38:32.2049941Z [rank0]:E1204 13:31:35.106000 422106 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2050131Z [rank0]:E1204 13:31:35.106000 422106 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2050451Z [rank0]:E1204 13:31:35.106000 422106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2050617Z [rank0]:E1204 13:31:35.106000 422106 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2050929Z [rank0]:E1204 13:31:35.106000 422106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2051066Z [rank0]:E1204 13:31:35.106000 422106 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2051370Z [rank0]:E1204 13:31:35.106000 422106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2051534Z [rank0]:E1204 13:31:35.106000 422106 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2051847Z [rank0]:E1204 13:31:35.106000 422106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2052023Z [rank0]:E1204 13:31:35.106000 422106 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2052321Z [rank0]:E1204 13:31:35.106000 422106 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2052473Z [rank0]:E1204 13:31:35.106000 422106 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2052775Z [rank0]:E1204 13:31:35.106000 422106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2052940Z [rank0]:E1204 13:31:35.106000 422106 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2053479Z [rank0]:E1204 13:31:35.106000 422106 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 2025-12-04T13:38:32.2053619Z [rank0]:E1204 13:31:35.106000 422106 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2053833Z [rank0]:E1204 13:31:35.106000 422106 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2054238Z [rank0]:E1204 13:31:35.106000 422106 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2054366Z [rank0]:E1204 13:31:35.106000 422106 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2054600Z [rank0]:E1204 13:31:35.106000 422106 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2054789Z [rank0]:E1204 13:31:35.106000 422106 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2054836Z dist init r=0, world=4 2025-12-04T13:38:32.2054983Z [rank2]:E1204 13:31:35.108000 422108 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2055160Z [rank2]:E1204 13:31:35.108000 422108 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2055473Z [rank2]:E1204 13:31:35.108000 422108 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2055645Z [rank2]:E1204 13:31:35.108000 422108 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2055956Z [rank2]:E1204 13:31:35.108000 422108 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2056093Z [rank2]:E1204 13:31:35.108000 422108 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2056408Z [rank2]:E1204 13:31:35.108000 422108 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2056568Z [rank2]:E1204 13:31:35.108000 422108 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2056879Z [rank2]:E1204 13:31:35.108000 422108 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2057038Z [rank2]:E1204 13:31:35.108000 422108 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2057339Z [rank2]:E1204 13:31:35.108000 422108 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2057486Z [rank2]:E1204 13:31:35.108000 422108 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2057789Z [rank2]:E1204 13:31:35.108000 422108 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2057970Z [rank2]:E1204 13:31:35.108000 422108 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2058504Z [rank2]:E1204 13:31:35.108000 422108 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 2. CUDA driver allocated memory was 2300575744 and is now 3833593856. 
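The `_warn_cpu_init()` UserWarning repeated above is emitted when FSDP is constructed from a module that still lives on CPU. A minimal sketch of the pattern the warning recommends, assuming the process group is already initialized; the toy model and `rank` argument are illustrative, not taken from the test suite:

import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed.fsdp.wrap import always_wrap_policy

def wrap_on_gpu(rank: int) -> FSDP:
    model = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 4))  # built on CPU
    return FSDP(
        model,
        auto_wrap_policy=always_wrap_policy,
        device_id=torch.device("cuda", rank),  # move to GPU before sharding init
        sync_module_states=True,               # requires the module to be on GPU
    )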
2025-12-04T13:38:32.2058631Z [rank2]:E1204 13:31:35.108000 422108 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2058841Z [rank2]:E1204 13:31:35.108000 422108 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2059252Z [rank2]:E1204 13:31:35.108000 422108 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2059375Z [rank2]:E1204 13:31:35.108000 422108 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2059663Z [rank2]:E1204 13:31:35.108000 422108 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2059846Z [rank2]:E1204 13:31:35.108000 422108 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2059888Z dist init r=2, world=4 2025-12-04T13:38:32.2060039Z [rank3]:E1204 13:31:35.118000 422109 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2060213Z [rank3]:E1204 13:31:35.118000 422109 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2060530Z [rank3]:E1204 13:31:35.118000 422109 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2060696Z [rank3]:E1204 13:31:35.118000 422109 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2061018Z [rank3]:E1204 13:31:35.118000 422109 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2061153Z [rank3]:E1204 13:31:35.118000 422109 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2061469Z [rank3]:E1204 13:31:35.118000 422109 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2061630Z [rank3]:E1204 13:31:35.118000 422109 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2061928Z [rank3]:E1204 13:31:35.118000 422109 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2062090Z [rank3]:E1204 13:31:35.118000 422109 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2062390Z [rank3]:E1204 13:31:35.118000 422109 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2062553Z [rank3]:E1204 13:31:35.118000 422109 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.2062852Z [rank3]:E1204 13:31:35.118000 422109 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2063017Z [rank3]:E1204 13:31:35.118000 422109 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2063552Z [rank3]:E1204 13:31:35.118000 422109 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 2025-12-04T13:38:32.2063676Z [rank3]:E1204 13:31:35.118000 422109 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2063891Z [rank3]:E1204 13:31:35.118000 422109 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2064304Z [rank3]:E1204 13:31:35.118000 422109 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2064430Z [rank3]:E1204 13:31:35.118000 422109 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2064658Z [rank3]:E1204 13:31:35.118000 422109 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2064839Z [rank3]:E1204 13:31:35.118000 422109 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2064886Z dist init r=3, world=4 2025-12-04T13:38:32.2065034Z [rank1]:E1204 13:31:35.161000 422107 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2065211Z [rank1]:E1204 13:31:35.161000 422107 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2065527Z [rank1]:E1204 13:31:35.161000 422107 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2065710Z [rank1]:E1204 13:31:35.161000 422107 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2066016Z [rank1]:E1204 13:31:35.161000 422107 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2066164Z [rank1]:E1204 13:31:35.161000 422107 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2066464Z [rank1]:E1204 13:31:35.161000 422107 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2066625Z [rank1]:E1204 13:31:35.161000 422107 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2066927Z [rank1]:E1204 13:31:35.161000 422107 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2067086Z [rank1]:E1204 13:31:35.161000 422107 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2067397Z [rank1]:E1204 13:31:35.161000 422107 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2067544Z [rank1]:E1204 13:31:35.161000 422107 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2067847Z [rank1]:E1204 13:31:35.161000 422107 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2068009Z [rank1]:E1204 13:31:35.161000 422107 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2068540Z [rank1]:E1204 13:31:35.161000 422107 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 70144 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 2025-12-04T13:38:32.2068678Z [rank1]:E1204 13:31:35.161000 422107 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2068888Z [rank1]:E1204 13:31:35.161000 422107 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2069291Z [rank1]:E1204 13:31:35.161000 422107 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2069415Z [rank1]:E1204 13:31:35.161000 422107 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2069675Z [rank1]:E1204 13:31:35.161000 422107 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2069853Z [rank1]:E1204 13:31:35.161000 422107 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2069899Z dist init r=1, world=4 2025-12-04T13:38:32.2070280Z [rank0]:[W1204 13:31:35.273776708 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.2070324Z FAILED [9.5204s] [ 8%] 2025-12-04T13:38:32.2070326Z 2025-12-04T13:38:32.2070389Z =================================== FAILURES =================================== 2025-12-04T13:38:32.2070526Z _ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda _ 2025-12-04T13:38:32.2070580Z Traceback (most recent call last): 2025-12-04T13:38:32.2070757Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.2070807Z self._join_processes(fn) 2025-12-04T13:38:32.2070994Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.2071056Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.2071250Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.2071301Z raise RuntimeError(error) 2025-12-04T13:38:32.2071387Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.2071442Z Traceback (most recent call last): 2025-12-04T13:38:32.2071630Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2071678Z getattr(self, test_name)() 2025-12-04T13:38:32.2071851Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2071891Z fn() 2025-12-04T13:38:32.2072058Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2072106Z method(*args, **kwargs) 2025-12-04T13:38:32.2072271Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2072318Z method(*args, **kwargs) 2025-12-04T13:38:32.2072485Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2072528Z with policy(): 2025-12-04T13:38:32.2072699Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2072744Z raise RuntimeError(msg) 2025-12-04T13:38:32.2073163Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 
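The RuntimeError text above comes from the memory-leak check enabled by PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1: allocator and driver memory are sampled before the test and again after it, and growth on any device is reported as a leak. Below is a deliberately simplified illustration of that before/after comparison, not the actual implementation (the real check lives in torch/testing/_internal/common_utils.py and queries the CUDA driver per device; `memory_reserved` is used here only as a stand-in for the driver-level figure):

import torch

def run_with_leak_check(test_fn, device: int = 0) -> None:
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    allocated_before = torch.cuda.memory_allocated(device)
    reserved_before = torch.cuda.memory_reserved(device)

    test_fn()

    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    allocated_after = torch.cuda.memory_allocated(device)
    reserved_after = torch.cuda.memory_reserved(device)

    # Flag the test if both the caching allocator and the reserved pool grew,
    # mirroring the shape of the error message printed in this log.
    if allocated_after > allocated_before and reserved_after > reserved_before:
        raise RuntimeError(
            f"possible CUDA memory leak on device {device}: caching allocator "
            f"{allocated_before} -> {allocated_after} bytes, "
            f"reserved {reserved_before} -> {reserved_after} bytes"
        )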
2025-12-04T13:38:32.2073165Z 2025-12-04T13:38:32.2073248Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2073520Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2073522Z 2025-12-04T13:38:32.2073619Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2073622Z 2025-12-04T13:38:32.2073624Z 2025-12-04T13:38:32.2073708Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.2073805Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.2074060Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-65f9ef345ca7c3f2.xml - 2025-12-04T13:38:32.2074128Z =========================== short test summary info ============================ 2025-12-04T13:38:32.2074423Z FAILED [9.5204s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.2074476Z Traceback (most recent call last): 2025-12-04T13:38:32.2074653Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2074722Z getattr(self, test_name)() 2025-12-04T13:38:32.2074895Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2074935Z fn() 2025-12-04T13:38:32.2075099Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2075144Z method(*args, **kwargs) 2025-12-04T13:38:32.2075308Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2075352Z method(*args, **kwargs) 2025-12-04T13:38:32.2075518Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2075560Z with policy(): 2025-12-04T13:38:32.2075725Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2075785Z raise RuntimeError(msg) 2025-12-04T13:38:32.2076178Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 2025-12-04T13:38:32.2076183Z 2025-12-04T13:38:32.2076263Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2076529Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2076532Z 2025-12-04T13:38:32.2076624Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2076693Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
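The failure reported above comes from the CUDA memory-leak checker that the repro command enables with PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 (alongside PYTORCH_TEST_WITH_ROCM=1 on this runner): it snapshots caching-allocator and driver-reported memory before the test body and raises once both have grown afterwards, which is why each rank prints "Caching allocator allocated memory was 512 and is now reported as ...". A rough Python sketch of that idea follows; check_leak is a placeholder name for illustration only, not the actual CudaMemoryLeakCheck context manager in torch/testing/_internal/common_utils.py.

    import torch

    def check_leak(fn, device=0):
        # Snapshot caching-allocator usage and driver-level free memory before the test.
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_before = torch.cuda.memory_allocated(device)     # caching-allocator bytes in use
        free_before, _total = torch.cuda.mem_get_info(device)  # driver-reported free bytes
        fn()
        # Re-snapshot after the test; flag a leak only if both views show growth.
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _total = torch.cuda.mem_get_info(device)
        if alloc_after > alloc_before and free_after < free_before:
            raise RuntimeError(
                f"possible leak on device {device}: caching allocator "
                f"{alloc_before} -> {alloc_after} bytes, driver free "
                f"{free_before} -> {free_after} bytes"
            )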
2025-12-04T13:38:32.2076760Z ======================= 1 failed, 21 deselected in 9.68s ======================= 2025-12-04T13:38:32.2076801Z Got exit code 1 2025-12-04T13:38:32.2076843Z Retrying single test... 2025-12-04T13:38:32.2077049Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-3b513d8e2c5d71c6.xml 2025-12-04T13:38:32.2077121Z ============================= test session starts ============================== 2025-12-04T13:38:32.2077246Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.2077289Z cachedir: .pytest_cache 2025-12-04T13:38:32.2077462Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.2077511Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.2077556Z configfile: pytest.ini 2025-12-04T13:38:32.2077731Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.2077813Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.2078074Z stepcurrent: skipping 21 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2078122Z Running 1 items in this shard 2025-12-04T13:38:32.2078126Z 2025-12-04T13:38:32.2078477Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda I1204 13:31:39.430000 422439 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 422508 2025-12-04T13:38:32.2078654Z I1204 13:31:39.431000 422439 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 422509 2025-12-04T13:38:32.2078820Z I1204 13:31:39.432000 422439 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 422510 2025-12-04T13:38:32.2078993Z I1204 13:31:39.432000 422439 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 422511 2025-12-04T13:38:32.2079707Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2079749Z _warn_cpu_init() 2025-12-04T13:38:32.2080366Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.2080423Z _warn_cpu_init() 2025-12-04T13:38:32.2081040Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2081082Z _warn_cpu_init() 2025-12-04T13:38:32.2081699Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2081739Z _warn_cpu_init() 2025-12-04T13:38:32.2082068Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.2082114Z return func(*args, **kwargs) 2025-12-04T13:38:32.2082270Z [rank1]:E1204 13:31:47.225000 422509 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2082444Z [rank1]:E1204 13:31:47.225000 422509 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2082760Z [rank1]:E1204 13:31:47.225000 422509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2082927Z [rank1]:E1204 13:31:47.225000 422509 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2083240Z [rank1]:E1204 13:31:47.225000 422509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2083394Z [rank1]:E1204 13:31:47.225000 422509 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2083695Z [rank1]:E1204 13:31:47.225000 422509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2083870Z [rank1]:E1204 13:31:47.225000 422509 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2084168Z [rank1]:E1204 13:31:47.225000 422509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2084329Z [rank1]:E1204 13:31:47.225000 422509 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2084626Z [rank1]:E1204 13:31:47.225000 422509 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2084774Z [rank1]:E1204 13:31:47.225000 422509 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2085086Z [rank1]:E1204 13:31:47.225000 422509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2085244Z [rank1]:E1204 13:31:47.225000 422509 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2085778Z [rank1]:E1204 13:31:47.225000 422509 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 2025-12-04T13:38:32.2085901Z [rank1]:E1204 13:31:47.225000 422509 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2086114Z [rank1]:E1204 13:31:47.225000 422509 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2086529Z [rank1]:E1204 13:31:47.225000 422509 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2086653Z [rank1]:E1204 13:31:47.225000 422509 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2086883Z [rank1]:E1204 13:31:47.225000 422509 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2087058Z [rank1]:E1204 13:31:47.225000 422509 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2087104Z dist init r=1, world=4 2025-12-04T13:38:32.2087251Z [rank3]:E1204 13:31:47.231000 422511 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2087424Z [rank3]:E1204 13:31:47.231000 422511 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2087732Z [rank3]:E1204 13:31:47.231000 422511 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2087899Z [rank3]:E1204 13:31:47.231000 422511 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2088221Z [rank3]:E1204 13:31:47.231000 422511 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2088368Z [rank3]:E1204 13:31:47.231000 422511 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2088670Z [rank3]:E1204 13:31:47.231000 422511 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2088828Z [rank3]:E1204 13:31:47.231000 422511 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2089128Z [rank3]:E1204 13:31:47.231000 422511 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2089285Z [rank3]:E1204 13:31:47.231000 422511 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2089614Z [rank3]:E1204 13:31:47.231000 422511 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2089774Z [rank3]:E1204 13:31:47.231000 422511 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2090075Z [rank3]:E1204 13:31:47.231000 422511 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2090235Z [rank3]:E1204 13:31:47.231000 422511 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2090764Z [rank3]:E1204 13:31:47.231000 422511 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 68096 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 
2025-12-04T13:38:32.2090890Z [rank3]:E1204 13:31:47.231000 422511 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2091113Z [rank3]:E1204 13:31:47.231000 422511 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2091516Z [rank3]:E1204 13:31:47.231000 422511 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2091639Z [rank3]:E1204 13:31:47.231000 422511 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2091869Z [rank3]:E1204 13:31:47.231000 422511 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2092047Z [rank3]:E1204 13:31:47.231000 422511 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2092087Z dist init r=3, world=4 2025-12-04T13:38:32.2092237Z [rank0]:E1204 13:31:47.238000 422508 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2092408Z [rank0]:E1204 13:31:47.238000 422508 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2092740Z [rank0]:E1204 13:31:47.238000 422508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2092906Z [rank0]:E1204 13:31:47.238000 422508 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2093227Z [rank0]:E1204 13:31:47.238000 422508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2093359Z [rank0]:E1204 13:31:47.238000 422508 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2093659Z [rank0]:E1204 13:31:47.238000 422508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2093819Z [rank0]:E1204 13:31:47.238000 422508 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2094114Z [rank0]:E1204 13:31:47.238000 422508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2094286Z [rank0]:E1204 13:31:47.238000 422508 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2094584Z [rank0]:E1204 13:31:47.238000 422508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2094732Z [rank0]:E1204 13:31:47.238000 422508 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.2095031Z [rank0]:E1204 13:31:47.238000 422508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2095193Z [rank0]:E1204 13:31:47.238000 422508 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2095738Z [rank0]:E1204 13:31:47.238000 422508 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 70144 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 2025-12-04T13:38:32.2095861Z [rank0]:E1204 13:31:47.238000 422508 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2096075Z [rank0]:E1204 13:31:47.238000 422508 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2096477Z [rank0]:E1204 13:31:47.238000 422508 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2096600Z [rank0]:E1204 13:31:47.238000 422508 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2096829Z [rank0]:E1204 13:31:47.238000 422508 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2097004Z [rank0]:E1204 13:31:47.238000 422508 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2097047Z dist init r=0, world=4 2025-12-04T13:38:32.2097206Z [rank2]:E1204 13:31:47.252000 422510 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2097381Z [rank2]:E1204 13:31:47.252000 422510 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2097702Z [rank2]:E1204 13:31:47.252000 422510 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2097867Z [rank2]:E1204 13:31:47.252000 422510 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2098173Z [rank2]:E1204 13:31:47.252000 422510 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2098308Z [rank2]:E1204 13:31:47.252000 422510 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2098611Z [rank2]:E1204 13:31:47.252000 422510 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2098782Z [rank2]:E1204 13:31:47.252000 422510 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2099081Z [rank2]:E1204 13:31:47.252000 422510 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2099239Z [rank2]:E1204 13:31:47.252000 422510 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2099537Z [rank2]:E1204 13:31:47.252000 422510 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2099719Z [rank2]:E1204 13:31:47.252000 422510 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2100021Z [rank2]:E1204 13:31:47.252000 422510 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2100182Z [rank2]:E1204 13:31:47.252000 422510 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2100730Z [rank2]:E1204 13:31:47.252000 422510 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 74240 on device 2. CUDA driver allocated memory was 2300575744 and is now 3833593856. 2025-12-04T13:38:32.2100853Z [rank2]:E1204 13:31:47.252000 422510 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2101065Z [rank2]:E1204 13:31:47.252000 422510 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2101469Z [rank2]:E1204 13:31:47.252000 422510 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2101590Z [rank2]:E1204 13:31:47.252000 422510 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2101836Z [rank2]:E1204 13:31:47.252000 422510 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2102014Z [rank2]:E1204 13:31:47.252000 422510 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2102071Z dist init r=2, world=4 2025-12-04T13:38:32.2102434Z [rank0]:[W1204 13:31:47.473401276 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.2102475Z FAILED [9.6209s] [100%] 2025-12-04T13:38:32.2102478Z 2025-12-04T13:38:32.2102543Z =================================== FAILURES =================================== 2025-12-04T13:38:32.2102663Z _ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda _ 2025-12-04T13:38:32.2102713Z Traceback (most recent call last): 2025-12-04T13:38:32.2102890Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.2102937Z self._join_processes(fn) 2025-12-04T13:38:32.2103122Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.2103202Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.2103393Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.2103442Z raise RuntimeError(error) 2025-12-04T13:38:32.2103525Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.2103577Z Traceback (most recent call last): 2025-12-04T13:38:32.2103751Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2103799Z getattr(self, test_name)() 2025-12-04T13:38:32.2103971Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2104010Z fn() 2025-12-04T13:38:32.2104175Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2104222Z method(*args, **kwargs) 2025-12-04T13:38:32.2104386Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2104428Z method(*args, **kwargs) 2025-12-04T13:38:32.2104603Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2104643Z with policy(): 2025-12-04T13:38:32.2104810Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2104853Z raise RuntimeError(msg) 2025-12-04T13:38:32.2105248Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 68096 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 
2025-12-04T13:38:32.2105252Z 2025-12-04T13:38:32.2105333Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2105601Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2105603Z 2025-12-04T13:38:32.2105698Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2105702Z 2025-12-04T13:38:32.2105704Z 2025-12-04T13:38:32.2105786Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.2105892Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.2106146Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-3b513d8e2c5d71c6.xml - 2025-12-04T13:38:32.2106225Z =========================== short test summary info ============================ 2025-12-04T13:38:32.2106507Z FAILED [9.6209s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.2106558Z Traceback (most recent call last): 2025-12-04T13:38:32.2106737Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2106784Z getattr(self, test_name)() 2025-12-04T13:38:32.2106957Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2106997Z fn() 2025-12-04T13:38:32.2107162Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2107208Z method(*args, **kwargs) 2025-12-04T13:38:32.2107383Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2107427Z method(*args, **kwargs) 2025-12-04T13:38:32.2107591Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2107632Z with policy(): 2025-12-04T13:38:32.2107797Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2107842Z raise RuntimeError(msg) 2025-12-04T13:38:32.2108242Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 68096 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 2025-12-04T13:38:32.2108245Z 2025-12-04T13:38:32.2108325Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2108598Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2108600Z 2025-12-04T13:38:32.2108694Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2108783Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
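Each retry also repeats the same _warn_cpu_init() UserWarning once per rank, because the test hands a CPU-resident module to FSDP. The warning's own recommendation is to pass device_id so FSDP moves the module to the accelerator before running sharding initialization. A minimal single-process sketch of that, assuming one CUDA/ROCm device and a one-rank NCCL process group (the test itself spawns four ranks through the internal multiprocessing harness instead), is:

    import os
    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    # One-rank process group, only so FSDP can be constructed in a single process.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("nccl", rank=0, world_size=1)

    model = torch.nn.Linear(8, 8)  # starts on CPU, like the module in the test above
    # device_id makes FSDP move the module to the GPU before sharding init,
    # which is exactly what the UserWarning suggests.
    fsdp_model = FSDP(model, device_id=torch.cuda.current_device())

    dist.destroy_process_group()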
2025-12-04T13:38:32.2108849Z ======================= 1 failed, 32 deselected in 9.78s ======================= 2025-12-04T13:38:32.2108891Z Got exit code 1 2025-12-04T13:38:32.2108933Z Retrying single test... 2025-12-04T13:38:32.2109139Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-aa44c67e35bd4a1b.xml 2025-12-04T13:38:32.2109201Z ============================= test session starts ============================== 2025-12-04T13:38:32.2109328Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.2109373Z cachedir: .pytest_cache 2025-12-04T13:38:32.2109544Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.2109641Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.2109687Z configfile: pytest.ini 2025-12-04T13:38:32.2109864Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.2109943Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.2110229Z stepcurrent: skipping 21 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2110275Z Running 1 items in this shard 2025-12-04T13:38:32.2110294Z 2025-12-04T13:38:32.2110641Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda I1204 13:31:51.615000 422841 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 422910 2025-12-04T13:38:32.2110808Z I1204 13:31:51.616000 422841 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 422911 2025-12-04T13:38:32.2110977Z I1204 13:31:51.616000 422841 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 422912 2025-12-04T13:38:32.2111139Z I1204 13:31:51.617000 422841 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 422913 2025-12-04T13:38:32.2111957Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2112019Z _warn_cpu_init() 2025-12-04T13:38:32.2112638Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.2112681Z _warn_cpu_init() 2025-12-04T13:38:32.2113293Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2113336Z _warn_cpu_init() 2025-12-04T13:38:32.2113979Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2114018Z _warn_cpu_init() 2025-12-04T13:38:32.2114333Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.2114380Z return func(*args, **kwargs) 2025-12-04T13:38:32.2114534Z [rank2]:E1204 13:31:59.351000 422912 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2114709Z [rank2]:E1204 13:31:59.351000 422912 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2115027Z [rank2]:E1204 13:31:59.351000 422912 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2115206Z [rank2]:E1204 13:31:59.351000 422912 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2115514Z [rank2]:E1204 13:31:59.351000 422912 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2115661Z [rank2]:E1204 13:31:59.351000 422912 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2115960Z [rank2]:E1204 13:31:59.351000 422912 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2116123Z [rank2]:E1204 13:31:59.351000 422912 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2116423Z [rank2]:E1204 13:31:59.351000 422912 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2116584Z [rank2]:E1204 13:31:59.351000 422912 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2116891Z [rank2]:E1204 13:31:59.351000 422912 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2117039Z [rank2]:E1204 13:31:59.351000 422912 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2117338Z [rank2]:E1204 13:31:59.351000 422912 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2117498Z [rank2]:E1204 13:31:59.351000 422912 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2118030Z [rank2]:E1204 13:31:59.351000 422912 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 74240 on device 2. CUDA driver allocated memory was 2300575744 and is now 3833593856. 2025-12-04T13:38:32.2118155Z [rank2]:E1204 13:31:59.351000 422912 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2118382Z [rank2]:E1204 13:31:59.351000 422912 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2118787Z [rank2]:E1204 13:31:59.351000 422912 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2118909Z [rank2]:E1204 13:31:59.351000 422912 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2119138Z [rank2]:E1204 13:31:59.351000 422912 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2119313Z [rank2]:E1204 13:31:59.351000 422912 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2119462Z [rank3]:E1204 13:31:59.351000 422913 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2119699Z [rank3]:E1204 13:31:59.351000 422913 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2120024Z [rank3]:E1204 13:31:59.351000 422913 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2120204Z [rank3]:E1204 13:31:59.351000 422913 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2120511Z [rank3]:E1204 13:31:59.351000 422913 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2120645Z [rank3]:E1204 13:31:59.351000 422913 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2120942Z [rank3]:E1204 13:31:59.351000 422913 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2121102Z [rank3]:E1204 13:31:59.351000 422913 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2121402Z [rank3]:E1204 13:31:59.351000 422913 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2121583Z [rank3]:E1204 13:31:59.351000 422913 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2121882Z [rank3]:E1204 13:31:59.351000 422913 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2122031Z [rank3]:E1204 13:31:59.351000 422913 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2122332Z [rank3]:E1204 13:31:59.351000 422913 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2122491Z [rank3]:E1204 13:31:59.351000 422913 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2123038Z [rank3]:E1204 13:31:59.351000 422913 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 
2025-12-04T13:38:32.2123160Z [rank3]:E1204 13:31:59.351000 422913 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2123374Z [rank3]:E1204 13:31:59.351000 422913 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2123780Z [rank3]:E1204 13:31:59.351000 422913 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2123901Z [rank3]:E1204 13:31:59.351000 422913 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2124131Z [rank3]:E1204 13:31:59.351000 422913 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2124308Z [rank3]:E1204 13:31:59.351000 422913 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2124361Z dist init r=3, world=4 2025-12-04T13:38:32.2124402Z dist init r=2, world=4 2025-12-04T13:38:32.2124551Z [rank1]:E1204 13:31:59.392000 422911 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2124734Z [rank1]:E1204 13:31:59.392000 422911 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2125044Z [rank1]:E1204 13:31:59.392000 422911 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2125210Z [rank1]:E1204 13:31:59.392000 422911 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2125522Z [rank1]:E1204 13:31:59.392000 422911 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2125656Z [rank1]:E1204 13:31:59.392000 422911 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2125957Z [rank1]:E1204 13:31:59.392000 422911 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2126138Z [rank1]:E1204 13:31:59.392000 422911 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2126441Z [rank1]:E1204 13:31:59.392000 422911 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2126601Z [rank1]:E1204 13:31:59.392000 422911 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2126902Z [rank1]:E1204 13:31:59.392000 422911 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2127050Z [rank1]:E1204 13:31:59.392000 422911 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2127352Z [rank1]:E1204 13:31:59.392000 422911 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2127523Z [rank1]:E1204 13:31:59.392000 422911 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2128052Z [rank1]:E1204 13:31:59.392000 422911 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 70144 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 2025-12-04T13:38:32.2128176Z [rank1]:E1204 13:31:59.392000 422911 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2128388Z [rank1]:E1204 13:31:59.392000 422911 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2128788Z [rank1]:E1204 13:31:59.392000 422911 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2128911Z [rank1]:E1204 13:31:59.392000 422911 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2129152Z [rank1]:E1204 13:31:59.392000 422911 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2129337Z [rank1]:E1204 13:31:59.392000 422911 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2129381Z dist init r=1, world=4 2025-12-04T13:38:32.2129527Z [rank0]:E1204 13:31:59.411000 422910 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2129793Z [rank0]:E1204 13:31:59.411000 422910 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2130106Z [rank0]:E1204 13:31:59.411000 422910 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2130274Z [rank0]:E1204 13:31:59.411000 422910 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2130580Z [rank0]:E1204 13:31:59.411000 422910 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2130730Z [rank0]:E1204 13:31:59.411000 422910 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2131031Z [rank0]:E1204 13:31:59.411000 422910 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2131190Z [rank0]:E1204 
13:31:59.411000 422910 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2131494Z [rank0]:E1204 13:31:59.411000 422910 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2131653Z [rank0]:E1204 13:31:59.411000 422910 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2131953Z [rank0]:E1204 13:31:59.411000 422910 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2132112Z [rank0]:E1204 13:31:59.411000 422910 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2132411Z [rank0]:E1204 13:31:59.411000 422910 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2132573Z [rank0]:E1204 13:31:59.411000 422910 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2133097Z [rank0]:E1204 13:31:59.411000 422910 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 2025-12-04T13:38:32.2133221Z [rank0]:E1204 13:31:59.411000 422910 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2133430Z [rank0]:E1204 13:31:59.411000 422910 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2133846Z [rank0]:E1204 13:31:59.411000 422910 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2133982Z [rank0]:E1204 13:31:59.411000 422910 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2134212Z [rank0]:E1204 13:31:59.411000 422910 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2134390Z [rank0]:E1204 13:31:59.411000 422910 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2134432Z dist init r=0, world=4 2025-12-04T13:38:32.2134796Z [rank0]:[W1204 13:31:59.679826963 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.2134837Z FAILED [9.6195s] [100%] 2025-12-04T13:38:32.2134839Z 2025-12-04T13:38:32.2134901Z =================================== FAILURES =================================== 2025-12-04T13:38:32.2135033Z _ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda _ 2025-12-04T13:38:32.2135085Z Traceback (most recent call last): 2025-12-04T13:38:32.2135260Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.2135308Z self._join_processes(fn) 2025-12-04T13:38:32.2135495Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.2135554Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.2135747Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.2135796Z raise RuntimeError(error) 2025-12-04T13:38:32.2135883Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.2135932Z Traceback (most recent call last): 2025-12-04T13:38:32.2136110Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2136157Z getattr(self, test_name)() 2025-12-04T13:38:32.2136330Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2136366Z fn() 2025-12-04T13:38:32.2136545Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2136589Z method(*args, **kwargs) 2025-12-04T13:38:32.2136755Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2136799Z method(*args, **kwargs) 2025-12-04T13:38:32.2136963Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2137004Z with policy(): 2025-12-04T13:38:32.2137171Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2137215Z raise RuntimeError(msg) 2025-12-04T13:38:32.2137612Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 
2025-12-04T13:38:32.2137615Z 2025-12-04T13:38:32.2137694Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2137972Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2137974Z 2025-12-04T13:38:32.2138070Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2138084Z 2025-12-04T13:38:32.2138086Z 2025-12-04T13:38:32.2138167Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.2138263Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.2141386Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-aa44c67e35bd4a1b.xml - 2025-12-04T13:38:32.2141458Z =========================== short test summary info ============================ 2025-12-04T13:38:32.2141745Z FAILED [9.6195s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.2141796Z Traceback (most recent call last): 2025-12-04T13:38:32.2141982Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2142059Z getattr(self, test_name)() 2025-12-04T13:38:32.2142234Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2142275Z fn() 2025-12-04T13:38:32.2142441Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2142487Z method(*args, **kwargs) 2025-12-04T13:38:32.2142651Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2142695Z method(*args, **kwargs) 2025-12-04T13:38:32.2142859Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2142900Z with policy(): 2025-12-04T13:38:32.2143065Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2143113Z raise RuntimeError(msg) 2025-12-04T13:38:32.2143510Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 2025-12-04T13:38:32.2143512Z 2025-12-04T13:38:32.2143610Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2143875Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2143881Z 2025-12-04T13:38:32.2143975Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2144045Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:38:32.2144113Z ======================= 1 failed, 32 deselected in 9.78s =======================
2025-12-04T13:38:32.2144155Z Got exit code 1
2025-12-04T13:38:32.2144367Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda
2025-12-04T13:38:32.2144506Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set
2025-12-04T13:38:32.2144710Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-58ee2716c244f3f6.xml
2025-12-04T13:38:32.2144775Z ============================= test session starts ==============================
2025-12-04T13:38:32.2144915Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T13:38:32.2144962Z cachedir: .pytest_cache
2025-12-04T13:38:32.2145132Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T13:38:32.2145213Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T13:38:32.2145256Z configfile: pytest.ini
2025-12-04T13:38:32.2145434Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T13:38:32.2145513Z collecting ... collected 60 items / 22 deselected / 38 selected
2025-12-04T13:38:32.2145573Z stepcurrent: skipping 22 already run items.
2025-12-04T13:38:32.2145619Z Running 11 items in this shard
2025-12-04T13:38:32.2145622Z
2025-12-04T13:38:32.2145969Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_shard_grad_op_cuda I1204 13:32:03.794000 423243 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 423312
2025-12-04T13:38:32.2146138Z I1204 13:32:03.795000 423243 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 423313
2025-12-04T13:38:32.2146315Z I1204 13:32:03.795000 423243 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 423314
2025-12-04T13:38:32.2146479Z I1204 13:32:03.795000 423243 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 423315
2025-12-04T13:38:32.2147116Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T13:38:32.2147158Z _warn_cpu_init()
2025-12-04T13:38:32.2147768Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T13:38:32.2147813Z _warn_cpu_init() 2025-12-04T13:38:32.2148145Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.2148190Z return func(*args, **kwargs) 2025-12-04T13:38:32.2148805Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2148845Z _warn_cpu_init() 2025-12-04T13:38:32.2149458Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2149510Z _warn_cpu_init() 2025-12-04T13:38:32.2149701Z [rank3]:E1204 13:32:11.746000 423315 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2149893Z [rank3]:E1204 13:32:11.746000 423315 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2150203Z [rank3]:E1204 13:32:11.746000 423315 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2150373Z [rank3]:E1204 13:32:11.746000 423315 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2150679Z [rank3]:E1204 13:32:11.746000 423315 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2150817Z [rank3]:E1204 13:32:11.746000 423315 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2151115Z [rank3]:E1204 13:32:11.746000 423315 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2151291Z [rank3]:E1204 13:32:11.746000 423315 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2151592Z [rank3]:E1204 13:32:11.746000 423315 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2151749Z [rank3]:E1204 13:32:11.746000 423315 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2152051Z [rank3]:E1204 13:32:11.746000 423315 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2152198Z [rank3]:E1204 13:32:11.746000 423315 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2152498Z [rank3]:E1204 13:32:11.746000 423315 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2152670Z [rank3]:E1204 13:32:11.746000 423315 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2153201Z [rank3]:E1204 13:32:11.746000 423315 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 24064 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 2025-12-04T13:38:32.2153328Z [rank3]:E1204 13:32:11.746000 423315 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2153538Z [rank3]:E1204 13:32:11.746000 423315 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2153944Z [rank3]:E1204 13:32:11.746000 423315 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2154067Z [rank3]:E1204 13:32:11.746000 423315 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2154309Z [rank3]:E1204 13:32:11.746000 423315 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2154486Z [rank3]:E1204 13:32:11.746000 423315 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2154542Z dist init r=3, world=4 2025-12-04T13:38:32.2154690Z [rank1]:E1204 13:32:11.748000 423313 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2154862Z [rank1]:E1204 13:32:11.748000 423313 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2155174Z [rank1]:E1204 13:32:11.748000 423313 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2155339Z [rank1]:E1204 13:32:11.748000 423313 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2155647Z [rank1]:E1204 13:32:11.748000 423313 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2155793Z [rank1]:E1204 13:32:11.748000 423313 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2156093Z [rank1]:E1204 13:32:11.748000 423313 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2156250Z [rank1]:E1204 13:32:11.748000 423313 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2156551Z [rank1]:E1204 13:32:11.748000 423313 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2156710Z [rank1]:E1204 13:32:11.748000 423313 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2157007Z [rank1]:E1204 13:32:11.748000 423313 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2157155Z [rank1]:E1204 13:32:11.748000 423313 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2157465Z [rank1]:E1204 13:32:11.748000 423313 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2157626Z [rank1]:E1204 13:32:11.748000 423313 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2158153Z [rank1]:E1204 13:32:11.748000 423313 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 24064 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 
2025-12-04T13:38:32.2158277Z [rank1]:E1204 13:32:11.748000 423313 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2158489Z [rank1]:E1204 13:32:11.748000 423313 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2158897Z [rank1]:E1204 13:32:11.748000 423313 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2159020Z [rank1]:E1204 13:32:11.748000 423313 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2159260Z [rank1]:E1204 13:32:11.748000 423313 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2159438Z [rank1]:E1204 13:32:11.748000 423313 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2159479Z dist init r=1, world=4 2025-12-04T13:38:32.2159703Z [rank2]:E1204 13:32:11.750000 423314 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2159877Z [rank2]:E1204 13:32:11.750000 423314 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2160192Z [rank2]:E1204 13:32:11.750000 423314 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2160375Z [rank2]:E1204 13:32:11.750000 423314 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2160680Z [rank2]:E1204 13:32:11.750000 423314 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2160815Z [rank2]:E1204 13:32:11.750000 423314 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2161113Z [rank2]:E1204 13:32:11.750000 423314 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2161273Z [rank2]:E1204 13:32:11.750000 423314 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2161571Z [rank2]:E1204 13:32:11.750000 423314 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2161729Z [rank2]:E1204 13:32:11.750000 423314 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2162040Z [rank2]:E1204 13:32:11.750000 423314 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2162189Z [rank2]:E1204 13:32:11.750000 423314 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.2162491Z [rank2]:E1204 13:32:11.750000 423314 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2162650Z [rank2]:E1204 13:32:11.750000 423314 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2163177Z [rank2]:E1204 13:32:11.750000 423314 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 2025-12-04T13:38:32.2163301Z [rank2]:E1204 13:32:11.750000 423314 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2163535Z [rank2]:E1204 13:32:11.750000 423314 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2163933Z [rank2]:E1204 13:32:11.750000 423314 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2164071Z [rank2]:E1204 13:32:11.750000 423314 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2164299Z [rank2]:E1204 13:32:11.750000 423314 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2164474Z [rank2]:E1204 13:32:11.750000 423314 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2164517Z dist init r=2, world=4 2025-12-04T13:38:32.2164666Z [rank0]:E1204 13:32:11.797000 423312 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2164845Z [rank0]:E1204 13:32:11.797000 423312 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2165169Z [rank0]:E1204 13:32:11.797000 423312 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2165338Z [rank0]:E1204 13:32:11.797000 423312 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2165645Z [rank0]:E1204 13:32:11.797000 423312 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2165779Z [rank0]:E1204 13:32:11.797000 423312 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2166077Z [rank0]:E1204 13:32:11.797000 423312 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2166236Z [rank0]:E1204 13:32:11.797000 423312 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2166545Z [rank0]:E1204 13:32:11.797000 423312 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2166706Z [rank0]:E1204 13:32:11.797000 423312 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2167003Z [rank0]:E1204 13:32:11.797000 423312 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2167152Z [rank0]:E1204 13:32:11.797000 423312 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2167454Z [rank0]:E1204 13:32:11.797000 423312 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2167616Z [rank0]:E1204 13:32:11.797000 423312 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2168157Z [rank0]:E1204 13:32:11.797000 423312 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 15872 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 2025-12-04T13:38:32.2168292Z [rank0]:E1204 13:32:11.797000 423312 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2168503Z [rank0]:E1204 13:32:11.797000 423312 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2168901Z [rank0]:E1204 13:32:11.797000 423312 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2169024Z [rank0]:E1204 13:32:11.797000 423312 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2169251Z [rank0]:E1204 13:32:11.797000 423312 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2169430Z [rank0]:E1204 13:32:11.797000 423312 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2169483Z dist init r=0, world=4 2025-12-04T13:38:32.2169897Z [rank0]:[W1204 13:32:12.085954180 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.2169941Z FAILED [9.8199s] [ 9%] 2025-12-04T13:38:32.2169944Z 2025-12-04T13:38:32.2170005Z =================================== FAILURES =================================== 2025-12-04T13:38:32.2170126Z _ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda _ 2025-12-04T13:38:32.2170176Z Traceback (most recent call last): 2025-12-04T13:38:32.2170356Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.2170403Z self._join_processes(fn) 2025-12-04T13:38:32.2170591Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.2170650Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.2170846Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.2170893Z raise RuntimeError(error) 2025-12-04T13:38:32.2171004Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.2171053Z Traceback (most recent call last): 2025-12-04T13:38:32.2171228Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2171274Z getattr(self, test_name)() 2025-12-04T13:38:32.2171447Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2171486Z fn() 2025-12-04T13:38:32.2171654Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2171698Z method(*args, **kwargs) 2025-12-04T13:38:32.2171861Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2171904Z method(*args, **kwargs) 2025-12-04T13:38:32.2172071Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2172111Z with policy(): 2025-12-04T13:38:32.2172281Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2172338Z raise RuntimeError(msg) 2025-12-04T13:38:32.2172733Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 
2025-12-04T13:38:32.2172750Z 2025-12-04T13:38:32.2172832Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2173095Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2173098Z 2025-12-04T13:38:32.2173195Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2173197Z 2025-12-04T13:38:32.2173260Z Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.2173311Z Traceback (most recent call last): 2025-12-04T13:38:32.2173488Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2173535Z getattr(self, test_name)() 2025-12-04T13:38:32.2173723Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2173762Z fn() 2025-12-04T13:38:32.2173926Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2173971Z method(*args, **kwargs) 2025-12-04T13:38:32.2174134Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2174179Z method(*args, **kwargs) 2025-12-04T13:38:32.2174342Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2174382Z with policy(): 2025-12-04T13:38:32.2174549Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2174594Z raise RuntimeError(msg) 2025-12-04T13:38:32.2174987Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 24064 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 2025-12-04T13:38:32.2174990Z 2025-12-04T13:38:32.2175068Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2175343Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2175345Z 2025-12-04T13:38:32.2175439Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2175441Z 2025-12-04T13:38:32.2175443Z 2025-12-04T13:38:32.2175526Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.2175623Z Process 2 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:38:32.2175874Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-58ee2716c244f3f6.xml - 2025-12-04T13:38:32.2175942Z =========================== short test summary info ============================ 2025-12-04T13:38:32.2176221Z FAILED [9.8199s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_shard_grad_op_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.2176271Z Traceback (most recent call last): 2025-12-04T13:38:32.2176459Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2176506Z getattr(self, test_name)() 2025-12-04T13:38:32.2176677Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2176733Z fn() 2025-12-04T13:38:32.2176895Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2176939Z method(*args, **kwargs) 2025-12-04T13:38:32.2177103Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2177149Z method(*args, **kwargs) 2025-12-04T13:38:32.2177311Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2177352Z with policy(): 2025-12-04T13:38:32.2177516Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2177562Z raise RuntimeError(msg) 2025-12-04T13:38:32.2177955Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 
2025-12-04T13:38:32.2177973Z
2025-12-04T13:38:32.2178052Z To execute this test, run the following from the base repo dir:
2025-12-04T13:38:32.2178315Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda
2025-12-04T13:38:32.2178317Z
2025-12-04T13:38:32.2178408Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T13:38:32.2178410Z
2025-12-04T13:38:32.2178474Z Process 3 exited with error code 10 and exception:
2025-12-04T13:38:32.2178523Z Traceback (most recent call last):
2025-12-04T13:38:32.2178702Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T13:38:32.2178748Z getattr(self, test_name)()
2025-12-04T13:38:32.2178921Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T13:38:32.2178958Z fn()
2025-12-04T13:38:32.2179122Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:38:32.2179165Z method(*args, **kwargs)
2025-12-04T13:38:32.2179347Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:38:32.2179390Z method(*args, **kwargs)
2025-12-04T13:38:32.2179556Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T13:38:32.2179637Z with policy():
2025-12-04T13:38:32.2179805Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T13:38:32.2179852Z raise RuntimeError(msg)
2025-12-04T13:38:32.2180238Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 24064 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536.
2025-12-04T13:38:32.2180240Z
2025-12-04T13:38:32.2180323Z To execute this test, run the following from the base repo dir:
2025-12-04T13:38:32.2180583Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda
2025-12-04T13:38:32.2180585Z
2025-12-04T13:38:32.2180697Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T13:38:32.2180767Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T13:38:32.2180851Z ======================= 1 failed, 22 deselected in 9.98s =======================
2025-12-04T13:38:32.2180891Z Got exit code 1
2025-12-04T13:38:32.2180936Z Retrying single test...
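The repro lines above run the failing test under PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1, and each failure message compares caching-allocator bytes before and after the test body ("Caching allocator allocated memory was 512 and is now reported as 24064 on device 3"). As a rough, hypothetical sketch of that before/after comparison only, and not the actual leak-check policy in torch/testing/_internal/common_utils.py, a minimal check built on torch.cuda.memory_allocated() could look like the following (the assert_no_cuda_leak name and tolerance_bytes threshold are illustrative, not part of the test suite):

    # Hedged sketch: mirrors the general before/after idea of the leak checker.
    # The real policy also tracks CUDA driver allocations; this only watches the
    # caching allocator via torch.cuda.memory_allocated().
    import gc
    from contextlib import contextmanager

    import torch

    @contextmanager
    def assert_no_cuda_leak(device: int = 0, tolerance_bytes: int = 0):
        torch.cuda.synchronize(device)
        gc.collect()
        torch.cuda.empty_cache()
        before = torch.cuda.memory_allocated(device)  # bytes held before the test body
        try:
            yield
        finally:
            torch.cuda.synchronize(device)
            gc.collect()
            torch.cuda.empty_cache()
            after = torch.cuda.memory_allocated(device)  # bytes still held afterwards
            if after - before > tolerance_bytes:
                raise RuntimeError(
                    f"possible CUDA memory leak: allocated was {before} "
                    f"and is now {after} on device {device}"
                )

In the runs above the post-test counter never returns to the 512-byte baseline on any rank, which is exactly the condition such a check raises on.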
2025-12-04T13:38:32.2181141Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-b3ca00c9ebbe8c37.xml 2025-12-04T13:38:32.2181206Z ============================= test session starts ============================== 2025-12-04T13:38:32.2181330Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.2181375Z cachedir: .pytest_cache 2025-12-04T13:38:32.2181546Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.2181599Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.2181643Z configfile: pytest.ini 2025-12-04T13:38:32.2181822Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.2181918Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.2182176Z stepcurrent: skipping 22 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2182225Z Running 1 items in this shard 2025-12-04T13:38:32.2182227Z 2025-12-04T13:38:32.2182570Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_shard_grad_op_cuda I1204 13:32:16.063000 423645 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 423714 2025-12-04T13:38:32.2182740Z I1204 13:32:16.063000 423645 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 423715 2025-12-04T13:38:32.2182903Z I1204 13:32:16.064000 423645 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 423716 2025-12-04T13:38:32.2183068Z I1204 13:32:16.064000 423645 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 423717 2025-12-04T13:38:32.2183703Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2183743Z _warn_cpu_init() 2025-12-04T13:38:32.2184063Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.2184108Z return func(*args, **kwargs) 2025-12-04T13:38:32.2184736Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.2184774Z _warn_cpu_init() 2025-12-04T13:38:32.2185397Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2185452Z _warn_cpu_init() 2025-12-04T13:38:32.2186064Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2186105Z _warn_cpu_init() 2025-12-04T13:38:32.2186258Z [rank1]:E1204 13:32:23.966000 423715 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2186436Z [rank1]:E1204 13:32:23.966000 423715 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2186749Z [rank1]:E1204 13:32:23.966000 423715 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2186930Z [rank1]:E1204 13:32:23.966000 423715 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2187241Z [rank1]:E1204 13:32:23.966000 423715 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2187375Z [rank1]:E1204 13:32:23.966000 423715 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2187675Z [rank1]:E1204 13:32:23.966000 423715 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2187836Z [rank1]:E1204 13:32:23.966000 423715 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2188133Z [rank1]:E1204 13:32:23.966000 423715 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2188302Z [rank1]:E1204 13:32:23.966000 423715 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2188602Z [rank1]:E1204 13:32:23.966000 423715 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2188751Z [rank1]:E1204 13:32:23.966000 423715 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2189052Z [rank1]:E1204 13:32:23.966000 
423715 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2189216Z [rank1]:E1204 13:32:23.966000 423715 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2189784Z [rank1]:E1204 13:32:23.966000 423715 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 15872 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 2025-12-04T13:38:32.2189924Z [rank1]:E1204 13:32:23.966000 423715 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2190136Z [rank1]:E1204 13:32:23.966000 423715 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2190553Z [rank1]:E1204 13:32:23.966000 423715 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2190678Z [rank1]:E1204 13:32:23.966000 423715 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2190904Z [rank1]:E1204 13:32:23.966000 423715 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2191083Z [rank1]:E1204 13:32:23.966000 423715 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2191125Z dist init r=1, world=4 2025-12-04T13:38:32.2191275Z [rank3]:E1204 13:32:23.977000 423717 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2191460Z [rank3]:E1204 13:32:23.977000 423717 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2191773Z [rank3]:E1204 13:32:23.977000 423717 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2191939Z [rank3]:E1204 13:32:23.977000 423717 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2192250Z [rank3]:E1204 13:32:23.977000 423717 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2192389Z [rank3]:E1204 13:32:23.977000 423717 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2192687Z [rank3]:E1204 13:32:23.977000 423717 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2192864Z [rank3]:E1204 13:32:23.977000 423717 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2193162Z 
[rank3]:E1204 13:32:23.977000 423717 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2193325Z [rank3]:E1204 13:32:23.977000 423717 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2193623Z [rank3]:E1204 13:32:23.977000 423717 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2193778Z [rank3]:E1204 13:32:23.977000 423717 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2194085Z [rank3]:E1204 13:32:23.977000 423717 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2194246Z [rank3]:E1204 13:32:23.977000 423717 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2194794Z [rank3]:E1204 13:32:23.977000 423717 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 2025-12-04T13:38:32.2194931Z [rank3]:E1204 13:32:23.977000 423717 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2195145Z [rank3]:E1204 13:32:23.977000 423717 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2195547Z [rank3]:E1204 13:32:23.977000 423717 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2195670Z [rank3]:E1204 13:32:23.977000 423717 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2195901Z [rank3]:E1204 13:32:23.977000 423717 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2196099Z [rank3]:E1204 13:32:23.977000 423717 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2196143Z dist init r=3, world=4 2025-12-04T13:38:32.2196292Z [rank2]:E1204 13:32:23.998000 423716 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2196468Z [rank2]:E1204 13:32:23.998000 423716 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2196777Z [rank2]:E1204 13:32:23.998000 423716 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2196945Z [rank2]:E1204 13:32:23.998000 423716 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, 
test_name)() 2025-12-04T13:38:32.2197257Z [rank2]:E1204 13:32:23.998000 423716 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2197392Z [rank2]:E1204 13:32:23.998000 423716 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2197705Z [rank2]:E1204 13:32:23.998000 423716 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2197865Z [rank2]:E1204 13:32:23.998000 423716 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2198164Z [rank2]:E1204 13:32:23.998000 423716 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2198323Z [rank2]:E1204 13:32:23.998000 423716 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2198622Z [rank2]:E1204 13:32:23.998000 423716 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2198768Z [rank2]:E1204 13:32:23.998000 423716 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2199088Z [rank2]:E1204 13:32:23.998000 423716 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2199250Z [rank2]:E1204 13:32:23.998000 423716 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2199827Z [rank2]:E1204 13:32:23.998000 423716 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 
2025-12-04T13:38:32.2199953Z [rank2]:E1204 13:32:23.998000 423716 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2200162Z [rank2]:E1204 13:32:23.998000 423716 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2200567Z [rank2]:E1204 13:32:23.998000 423716 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2200709Z [rank2]:E1204 13:32:23.998000 423716 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2200936Z [rank2]:E1204 13:32:23.998000 423716 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2201116Z [rank2]:E1204 13:32:23.998000 423716 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2201157Z dist init r=2, world=4 2025-12-04T13:38:32.2201307Z [rank0]:E1204 13:32:24.012000 423714 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2201480Z [rank0]:E1204 13:32:24.012000 423714 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2201791Z [rank0]:E1204 13:32:24.012000 423714 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2201958Z [rank0]:E1204 13:32:24.012000 423714 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2202281Z [rank0]:E1204 13:32:24.012000 423714 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2202418Z [rank0]:E1204 13:32:24.012000 423714 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2202717Z [rank0]:E1204 13:32:24.012000 423714 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2202878Z [rank0]:E1204 13:32:24.012000 423714 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2203175Z [rank0]:E1204 13:32:24.012000 423714 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2203334Z [rank0]:E1204 13:32:24.012000 423714 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2203648Z [rank0]:E1204 13:32:24.012000 423714 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2203794Z [rank0]:E1204 13:32:24.012000 423714 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.2204111Z [rank0]:E1204 13:32:24.012000 423714 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2204271Z [rank0]:E1204 13:32:24.012000 423714 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2204800Z [rank0]:E1204 13:32:24.012000 423714 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 15872 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 2025-12-04T13:38:32.2204926Z [rank0]:E1204 13:32:24.012000 423714 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2205135Z [rank0]:E1204 13:32:24.012000 423714 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2205549Z [rank0]:E1204 13:32:24.012000 423714 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2205670Z [rank0]:E1204 13:32:24.012000 423714 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2205899Z [rank0]:E1204 13:32:24.012000 423714 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2206076Z [rank0]:E1204 13:32:24.012000 423714 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2206118Z dist init r=0, world=4 2025-12-04T13:38:32.2206482Z [rank0]:[W1204 13:32:24.276152950 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.2206525Z FAILED [9.9199s] [100%] 2025-12-04T13:38:32.2206527Z 2025-12-04T13:38:32.2206589Z =================================== FAILURES =================================== 2025-12-04T13:38:32.2206718Z _ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda _ 2025-12-04T13:38:32.2206771Z Traceback (most recent call last): 2025-12-04T13:38:32.2206948Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.2206997Z self._join_processes(fn) 2025-12-04T13:38:32.2207184Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.2207246Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.2207441Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.2207488Z raise RuntimeError(error) 2025-12-04T13:38:32.2207574Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.2207624Z Traceback (most recent call last): 2025-12-04T13:38:32.2207805Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2207852Z getattr(self, test_name)() 2025-12-04T13:38:32.2208034Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2208074Z fn() 2025-12-04T13:38:32.2208237Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2208296Z method(*args, **kwargs) 2025-12-04T13:38:32.2208460Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2208504Z method(*args, **kwargs) 2025-12-04T13:38:32.2208667Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2208709Z with policy(): 2025-12-04T13:38:32.2208875Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2208919Z raise RuntimeError(msg) 2025-12-04T13:38:32.2209313Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 15872 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 
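Note: the ProcessGroupNCCL warning above asks for an explicit shutdown before the process exits. A minimal sketch of such a teardown, assuming the default process group was created with torch.distributed (illustrative, not code from this test suite):

    import torch.distributed as dist

    def teardown():
        # Explicitly destroy the default process group before program exit,
        # as the ProcessGroupNCCL warning recommends.
        if dist.is_available() and dist.is_initialized():
            dist.barrier()
            dist.destroy_process_group()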
2025-12-04T13:38:32.2209331Z 2025-12-04T13:38:32.2209411Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2209733Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2209736Z 2025-12-04T13:38:32.2209830Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2209833Z 2025-12-04T13:38:32.2209900Z Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.2209948Z Traceback (most recent call last): 2025-12-04T13:38:32.2210126Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2210172Z getattr(self, test_name)() 2025-12-04T13:38:32.2210346Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2210386Z fn() 2025-12-04T13:38:32.2210549Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2210592Z method(*args, **kwargs) 2025-12-04T13:38:32.2210755Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2210800Z method(*args, **kwargs) 2025-12-04T13:38:32.2210978Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2211020Z with policy(): 2025-12-04T13:38:32.2211184Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2211231Z raise RuntimeError(msg) 2025-12-04T13:38:32.2211618Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 2025-12-04T13:38:32.2211622Z 2025-12-04T13:38:32.2211702Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2211965Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2211967Z 2025-12-04T13:38:32.2212064Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2212066Z 2025-12-04T13:38:32.2212068Z 2025-12-04T13:38:32.2212165Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.2212259Z Process 1 terminated with exit code 10, terminating remaining processes. 
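Note: the _join_processes/_check_return_codes frames above come from a multi-process test harness that spawns one child per rank, joins them, and re-raises if any rank exited non-zero (exit code 10 here is the leak-check failure in the child). A hedged sketch of that pattern using the standard library (not the common_distributed.py implementation; names are illustrative):

    import multiprocessing as mp

    def join_and_check(target, world_size=4):
        # One child process per rank runs the test body.
        procs = [mp.Process(target=target, args=(rank, world_size))
                 for rank in range(world_size)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        # The parent surfaces the first non-zero exit code as a test failure.
        for rank, p in enumerate(procs):
            if p.exitcode != 0:
                raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")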
2025-12-04T13:38:32.2212518Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-b3ca00c9ebbe8c37.xml - 2025-12-04T13:38:32.2212605Z =========================== short test summary info ============================ 2025-12-04T13:38:32.2212885Z FAILED [9.9199s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_shard_grad_op_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.2212934Z Traceback (most recent call last): 2025-12-04T13:38:32.2213114Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2213158Z getattr(self, test_name)() 2025-12-04T13:38:32.2213334Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2213371Z fn() 2025-12-04T13:38:32.2213537Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2213593Z method(*args, **kwargs) 2025-12-04T13:38:32.2213758Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2213800Z method(*args, **kwargs) 2025-12-04T13:38:32.2213965Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2214005Z with policy(): 2025-12-04T13:38:32.2214172Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2214215Z raise RuntimeError(msg) 2025-12-04T13:38:32.2214609Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 15872 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 
2025-12-04T13:38:32.2214612Z 2025-12-04T13:38:32.2214693Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2214959Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2214961Z 2025-12-04T13:38:32.2215067Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2215069Z 2025-12-04T13:38:32.2215132Z Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.2215183Z Traceback (most recent call last): 2025-12-04T13:38:32.2215359Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2215406Z getattr(self, test_name)() 2025-12-04T13:38:32.2215578Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2215618Z fn() 2025-12-04T13:38:32.2215779Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2215823Z method(*args, **kwargs) 2025-12-04T13:38:32.2215984Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2216029Z method(*args, **kwargs) 2025-12-04T13:38:32.2216195Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2216234Z with policy(): 2025-12-04T13:38:32.2216412Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2216456Z raise RuntimeError(msg) 2025-12-04T13:38:32.2216844Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 2025-12-04T13:38:32.2216859Z 2025-12-04T13:38:32.2216937Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2217200Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2217203Z 2025-12-04T13:38:32.2217295Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2217366Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.2217434Z ====================== 1 failed, 32 deselected in 10.08s ======================= 2025-12-04T13:38:32.2217475Z Got exit code 1 2025-12-04T13:38:32.2217519Z Retrying single test... 
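Note: after the shard fails, the runner re-runs the failing test by itself; if that re-run also fails, the test is later marked FAILED CONSISTENTLY and the run continues (continue-through-error). A hedged sketch of that control flow, with placeholder command lists and helper name, not the actual run_test.py logic:

    import subprocess

    def run_then_retry(shard_cmd, single_test_cmd, test_id):
        # Run the shard; on failure, retry only the failing test.
        if subprocess.call(shard_cmd) == 0:
            return True
        print("Got exit code 1")
        print("Retrying single test...")
        if subprocess.call(single_test_cmd) == 0:
            return True
        # Matches the "FAILED CONSISTENTLY" line seen later in this log.
        print(f"FAILED CONSISTENTLY: {test_id}")
        return False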
2025-12-04T13:38:32.2217735Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-3a279b5f32a64f35.xml 2025-12-04T13:38:32.2217798Z ============================= test session starts ============================== 2025-12-04T13:38:32.2217921Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.2217967Z cachedir: .pytest_cache 2025-12-04T13:38:32.2218139Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.2218190Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.2218233Z configfile: pytest.ini 2025-12-04T13:38:32.2218413Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.2218492Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.2218750Z stepcurrent: skipping 22 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2218798Z Running 1 items in this shard 2025-12-04T13:38:32.2218800Z 2025-12-04T13:38:32.2219157Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_shard_grad_op_cuda I1204 13:32:28.446000 424047 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 424116 2025-12-04T13:38:32.2219324Z I1204 13:32:28.447000 424047 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 424117 2025-12-04T13:38:32.2219490Z I1204 13:32:28.448000 424047 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 424118 2025-12-04T13:38:32.2219688Z I1204 13:32:28.448000 424047 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 424119 2025-12-04T13:38:32.2220321Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2220363Z _warn_cpu_init() 2025-12-04T13:38:32.2220695Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.2220743Z return func(*args, **kwargs) 2025-12-04T13:38:32.2221356Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
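Note: the UserWarning above recommends passing the device_id argument to FSDP so sharding initialization (and sync_module_states=True) runs on the GPU instead of the CPU. A minimal sketch, assuming a process group is already initialized; the wrapper is illustrative and not taken from this test file:

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_on_gpu(module, rank):
        # device_id lets FSDP move the CPU-resident module to the GPU for
        # sharding init, which sync_module_states=True requires.
        return FSDP(
            module,
            device_id=torch.device("cuda", rank),
            sync_module_states=True,
        )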
2025-12-04T13:38:32.2221412Z _warn_cpu_init() 2025-12-04T13:38:32.2222026Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2222065Z _warn_cpu_init() 2025-12-04T13:38:32.2222682Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2222737Z _warn_cpu_init() 2025-12-04T13:38:32.2222892Z [rank2]:E1204 13:32:36.426000 424118 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2223068Z [rank2]:E1204 13:32:36.426000 424118 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2223389Z [rank2]:E1204 13:32:36.426000 424118 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2223560Z [rank2]:E1204 13:32:36.426000 424118 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2223869Z [rank2]:E1204 13:32:36.426000 424118 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2224020Z [rank2]:E1204 13:32:36.426000 424118 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2224318Z [rank2]:E1204 13:32:36.426000 424118 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2224479Z [rank2]:E1204 13:32:36.426000 424118 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2224780Z [rank2]:E1204 13:32:36.426000 424118 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2224941Z [rank2]:E1204 13:32:36.426000 424118 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2225244Z [rank2]:E1204 13:32:36.426000 424118 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2225401Z [rank2]:E1204 13:32:36.426000 424118 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2225702Z [rank2]:E1204 13:32:36.426000 
424118 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2225873Z [rank2]:E1204 13:32:36.426000 424118 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2226404Z [rank2]:E1204 13:32:36.426000 424118 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 2025-12-04T13:38:32.2226528Z [rank2]:E1204 13:32:36.426000 424118 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2226744Z [rank2]:E1204 13:32:36.426000 424118 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2227154Z [rank2]:E1204 13:32:36.426000 424118 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2227288Z [rank2]:E1204 13:32:36.426000 424118 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2227521Z [rank2]:E1204 13:32:36.426000 424118 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2227697Z [rank2]:E1204 13:32:36.426000 424118 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2227743Z dist init r=2, world=4 2025-12-04T13:38:32.2227890Z [rank1]:E1204 13:32:36.430000 424117 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2228064Z [rank1]:E1204 13:32:36.430000 424117 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2228375Z [rank1]:E1204 13:32:36.430000 424117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2228552Z [rank1]:E1204 13:32:36.430000 424117 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2228859Z [rank1]:E1204 13:32:36.430000 424117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2228993Z [rank1]:E1204 13:32:36.430000 424117 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2229291Z [rank1]:E1204 13:32:36.430000 424117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2229451Z [rank1]:E1204 13:32:36.430000 424117 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2229919Z 
[rank1]:E1204 13:32:36.430000 424117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2230078Z [rank1]:E1204 13:32:36.430000 424117 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2230400Z [rank1]:E1204 13:32:36.430000 424117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2230562Z [rank1]:E1204 13:32:36.430000 424117 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2230860Z [rank1]:E1204 13:32:36.430000 424117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2231021Z [rank1]:E1204 13:32:36.430000 424117 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2231545Z [rank1]:E1204 13:32:36.430000 424117 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 2025-12-04T13:38:32.2231671Z [rank1]:E1204 13:32:36.430000 424117 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2231897Z [rank1]:E1204 13:32:36.430000 424117 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2232296Z [rank1]:E1204 13:32:36.430000 424117 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2232419Z [rank1]:E1204 13:32:36.430000 424117 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2232647Z [rank1]:E1204 13:32:36.430000 424117 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2232826Z [rank1]:E1204 13:32:36.430000 424117 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2232868Z dist init r=1, world=4 2025-12-04T13:38:32.2233016Z [rank3]:E1204 13:32:36.434000 424119 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2233186Z [rank3]:E1204 13:32:36.434000 424119 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2233508Z [rank3]:E1204 13:32:36.434000 424119 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2233675Z [rank3]:E1204 13:32:36.434000 424119 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, 
test_name)() 2025-12-04T13:38:32.2233981Z [rank3]:E1204 13:32:36.434000 424119 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2234118Z [rank3]:E1204 13:32:36.434000 424119 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2234415Z [rank3]:E1204 13:32:36.434000 424119 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2234575Z [rank3]:E1204 13:32:36.434000 424119 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2234883Z [rank3]:E1204 13:32:36.434000 424119 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2235043Z [rank3]:E1204 13:32:36.434000 424119 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2235355Z [rank3]:E1204 13:32:36.434000 424119 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2235502Z [rank3]:E1204 13:32:36.434000 424119 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2235802Z [rank3]:E1204 13:32:36.434000 424119 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2235961Z [rank3]:E1204 13:32:36.434000 424119 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2236486Z [rank3]:E1204 13:32:36.434000 424119 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 
2025-12-04T13:38:32.2236621Z [rank3]:E1204 13:32:36.434000 424119 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2236832Z [rank3]:E1204 13:32:36.434000 424119 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2237231Z [rank3]:E1204 13:32:36.434000 424119 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2237351Z [rank3]:E1204 13:32:36.434000 424119 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2237583Z [rank3]:E1204 13:32:36.434000 424119 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2237759Z [rank3]:E1204 13:32:36.434000 424119 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2237801Z dist init r=3, world=4 2025-12-04T13:38:32.2237957Z [rank0]:E1204 13:32:36.485000 424116 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2238131Z [rank0]:E1204 13:32:36.485000 424116 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2238439Z [rank0]:E1204 13:32:36.485000 424116 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2238607Z [rank0]:E1204 13:32:36.485000 424116 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2238915Z [rank0]:E1204 13:32:36.485000 424116 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2239047Z [rank0]:E1204 13:32:36.485000 424116 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2239359Z [rank0]:E1204 13:32:36.485000 424116 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2239518Z [rank0]:E1204 13:32:36.485000 424116 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2239877Z [rank0]:E1204 13:32:36.485000 424116 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2240035Z [rank0]:E1204 13:32:36.485000 424116 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2240339Z [rank0]:E1204 13:32:36.485000 424116 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2240487Z [rank0]:E1204 13:32:36.485000 424116 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.2240786Z [rank0]:E1204 13:32:36.485000 424116 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2240964Z [rank0]:E1204 13:32:36.485000 424116 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2241491Z [rank0]:E1204 13:32:36.485000 424116 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 2025-12-04T13:38:32.2241615Z [rank0]:E1204 13:32:36.485000 424116 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2241826Z [rank0]:E1204 13:32:36.485000 424116 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2242227Z [rank0]:E1204 13:32:36.485000 424116 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2242352Z [rank0]:E1204 13:32:36.485000 424116 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2242593Z [rank0]:E1204 13:32:36.485000 424116 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2242772Z [rank0]:E1204 13:32:36.485000 424116 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2242813Z dist init r=0, world=4 2025-12-04T13:38:32.2243177Z [rank0]:[W1204 13:32:36.769961927 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.2243220Z FAILED [9.9200s] [100%] 2025-12-04T13:38:32.2243223Z 2025-12-04T13:38:32.2243284Z =================================== FAILURES =================================== 2025-12-04T13:38:32.2243403Z _ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda _ 2025-12-04T13:38:32.2243455Z Traceback (most recent call last): 2025-12-04T13:38:32.2243630Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.2243679Z self._join_processes(fn) 2025-12-04T13:38:32.2243878Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.2243939Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.2244134Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.2244197Z raise RuntimeError(error) 2025-12-04T13:38:32.2244284Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.2244332Z Traceback (most recent call last): 2025-12-04T13:38:32.2244508Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2244554Z getattr(self, test_name)() 2025-12-04T13:38:32.2244727Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2244764Z fn() 2025-12-04T13:38:32.2244932Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2244976Z method(*args, **kwargs) 2025-12-04T13:38:32.2245142Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2245197Z method(*args, **kwargs) 2025-12-04T13:38:32.2245364Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2245403Z with policy(): 2025-12-04T13:38:32.2245570Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2245615Z raise RuntimeError(msg) 2025-12-04T13:38:32.2246010Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 
2025-12-04T13:38:32.2246012Z 2025-12-04T13:38:32.2246093Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2246359Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2246362Z 2025-12-04T13:38:32.2246458Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2246460Z 2025-12-04T13:38:32.2246522Z Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.2246573Z Traceback (most recent call last): 2025-12-04T13:38:32.2246764Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2246811Z getattr(self, test_name)() 2025-12-04T13:38:32.2246984Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2247023Z fn() 2025-12-04T13:38:32.2247187Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2247233Z method(*args, **kwargs) 2025-12-04T13:38:32.2247396Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2247440Z method(*args, **kwargs) 2025-12-04T13:38:32.2247602Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2247644Z with policy(): 2025-12-04T13:38:32.2247809Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2247854Z raise RuntimeError(msg) 2025-12-04T13:38:32.2248258Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 2025-12-04T13:38:32.2248274Z 2025-12-04T13:38:32.2248353Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2248616Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2248619Z 2025-12-04T13:38:32.2248712Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2248714Z 2025-12-04T13:38:32.2248716Z 2025-12-04T13:38:32.2248799Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.2248891Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:38:32.2249149Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-3a279b5f32a64f35.xml - 2025-12-04T13:38:32.2249214Z =========================== short test summary info ============================ 2025-12-04T13:38:32.2249510Z FAILED [9.9200s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_shard_grad_op_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.2249560Z Traceback (most recent call last): 2025-12-04T13:38:32.2249769Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2249816Z getattr(self, test_name)() 2025-12-04T13:38:32.2249989Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2250028Z fn() 2025-12-04T13:38:32.2250194Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2250238Z method(*args, **kwargs) 2025-12-04T13:38:32.2250404Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2250449Z method(*args, **kwargs) 2025-12-04T13:38:32.2250611Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2250652Z with policy(): 2025-12-04T13:38:32.2250833Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2250879Z raise RuntimeError(msg) 2025-12-04T13:38:32.2251269Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 
2025-12-04T13:38:32.2251271Z 2025-12-04T13:38:32.2251350Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2251614Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2251618Z 2025-12-04T13:38:32.2251710Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2251712Z 2025-12-04T13:38:32.2251776Z Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.2251825Z Traceback (most recent call last): 2025-12-04T13:38:32.2252002Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2252046Z getattr(self, test_name)() 2025-12-04T13:38:32.2252234Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2252271Z fn() 2025-12-04T13:38:32.2252450Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2252494Z method(*args, **kwargs) 2025-12-04T13:38:32.2252659Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2252700Z method(*args, **kwargs) 2025-12-04T13:38:32.2252867Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2252906Z with policy(): 2025-12-04T13:38:32.2253071Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2253114Z raise RuntimeError(msg) 2025-12-04T13:38:32.2253506Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 2025-12-04T13:38:32.2253523Z 2025-12-04T13:38:32.2253603Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2253863Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2253865Z 2025-12-04T13:38:32.2253960Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2254029Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:38:32.2254098Z ====================== 1 failed, 32 deselected in 10.08s ======================= 2025-12-04T13:38:32.2254138Z Got exit code 1 2025-12-04T13:38:32.2254349Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2254487Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:32.2254693Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6bd2a6a86159d468.xml 2025-12-04T13:38:32.2254756Z ============================= test session starts ============================== 2025-12-04T13:38:32.2254890Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.2254934Z cachedir: .pytest_cache 2025-12-04T13:38:32.2255108Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.2255157Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.2255203Z configfile: pytest.ini 2025-12-04T13:38:32.2255379Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.2255461Z collecting ... collected 60 items / 23 deselected / 37 selected 2025-12-04T13:38:32.2255519Z stepcurrent: skipping 23 already run items. 2025-12-04T13:38:32.2255567Z Running 10 items in this shard 2025-12-04T13:38:32.2255569Z 2025-12-04T13:38:32.2255915Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_shard_grad_op_cuda I1204 13:32:40.792000 424449 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 424518 2025-12-04T13:38:32.2256081Z I1204 13:32:40.792000 424449 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 424519 2025-12-04T13:38:32.2256257Z I1204 13:32:40.793000 424449 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 424520 2025-12-04T13:38:32.2256419Z I1204 13:32:40.793000 424449 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 424521 2025-12-04T13:38:32.2257057Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2257099Z _warn_cpu_init() 2025-12-04T13:38:32.2257715Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.2257770Z _warn_cpu_init() 2025-12-04T13:38:32.2258383Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2258425Z _warn_cpu_init() 2025-12-04T13:38:32.2259030Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2259073Z _warn_cpu_init() 2025-12-04T13:38:32.2259387Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.2259432Z return func(*args, **kwargs) 2025-12-04T13:38:32.2259638Z [rank1]:E1204 13:32:48.448000 424519 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2259812Z [rank1]:E1204 13:32:48.448000 424519 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2260125Z [rank1]:E1204 13:32:48.448000 424519 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2260293Z [rank1]:E1204 13:32:48.448000 424519 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2260604Z [rank1]:E1204 13:32:48.448000 424519 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2260740Z [rank1]:E1204 13:32:48.448000 424519 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2261053Z [rank1]:E1204 13:32:48.448000 424519 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2261214Z [rank1]:E1204 13:32:48.448000 424519 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2261533Z [rank1]:E1204 13:32:48.448000 424519 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2261693Z [rank1]:E1204 13:32:48.448000 424519 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2261992Z [rank1]:E1204 13:32:48.448000 424519 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2262141Z [rank1]:E1204 13:32:48.448000 424519 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2262439Z [rank1]:E1204 13:32:48.448000 424519 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2262625Z [rank1]:E1204 13:32:48.448000 424519 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2263154Z [rank1]:E1204 13:32:48.448000 424519 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 53760 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 2025-12-04T13:38:32.2263277Z [rank1]:E1204 13:32:48.448000 424519 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2263491Z [rank1]:E1204 13:32:48.448000 424519 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2263889Z [rank1]:E1204 13:32:48.448000 424519 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2264014Z [rank1]:E1204 13:32:48.448000 424519 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2264256Z [rank1]:E1204 13:32:48.448000 424519 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2264431Z [rank1]:E1204 13:32:48.448000 424519 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2264474Z dist init r=1, world=4 2025-12-04T13:38:32.2264620Z [rank2]:E1204 13:32:48.454000 424520 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2264794Z [rank2]:E1204 13:32:48.454000 424520 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2265103Z [rank2]:E1204 13:32:48.454000 424520 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2265272Z [rank2]:E1204 13:32:48.454000 424520 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2265581Z [rank2]:E1204 13:32:48.454000 424520 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2265726Z [rank2]:E1204 13:32:48.454000 424520 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2266026Z [rank2]:E1204 13:32:48.454000 424520 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2266196Z [rank2]:E1204 13:32:48.454000 424520 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2266494Z [rank2]:E1204 13:32:48.454000 424520 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2266652Z [rank2]:E1204 13:32:48.454000 424520 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2266950Z [rank2]:E1204 13:32:48.454000 424520 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2267097Z [rank2]:E1204 13:32:48.454000 424520 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2267409Z [rank2]:E1204 13:32:48.454000 424520 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2267570Z [rank2]:E1204 13:32:48.454000 424520 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2268093Z [rank2]:E1204 13:32:48.454000 424520 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3833593856. 
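Note: the c10d barrier() UserWarning above suggests specifying device_id in init_process_group so collectives know which device to use. A minimal sketch with placeholder rendezvous settings (illustrative only, assumes one GPU per rank):

    import torch
    import torch.distributed as dist

    def init(rank, world_size):
        # Binding the group to a device at init time mutes the barrier() warning.
        dist.init_process_group(
            backend="nccl",
            init_method="env://",
            rank=rank,
            world_size=world_size,
            device_id=torch.device("cuda", rank),
        )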
2025-12-04T13:38:32.2268217Z [rank2]:E1204 13:32:48.454000 424520 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2268427Z [rank2]:E1204 13:32:48.454000 424520 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2268834Z [rank2]:E1204 13:32:48.454000 424520 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2268956Z [rank2]:E1204 13:32:48.454000 424520 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2269186Z [rank2]:E1204 13:32:48.454000 424520 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2269363Z [rank2]:E1204 13:32:48.454000 424520 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2269405Z dist init r=2, world=4 2025-12-04T13:38:32.2269556Z [rank3]:E1204 13:32:48.501000 424521 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2269743Z [rank3]:E1204 13:32:48.501000 424521 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2270054Z [rank3]:E1204 13:32:48.501000 424521 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2270233Z [rank3]:E1204 13:32:48.501000 424521 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2270541Z [rank3]:E1204 13:32:48.501000 424521 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2270691Z [rank3]:E1204 13:32:48.501000 424521 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2270989Z [rank3]:E1204 13:32:48.501000 424521 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2271150Z [rank3]:E1204 13:32:48.501000 424521 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2271447Z [rank3]:E1204 13:32:48.501000 424521 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2271606Z [rank3]:E1204 13:32:48.501000 424521 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2271917Z [rank3]:E1204 13:32:48.501000 424521 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2272065Z [rank3]:E1204 13:32:48.501000 424521 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.2272365Z [rank3]:E1204 13:32:48.501000 424521 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2272526Z [rank3]:E1204 13:32:48.501000 424521 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2273052Z [rank3]:E1204 13:32:48.501000 424521 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 2025-12-04T13:38:32.2273175Z [rank3]:E1204 13:32:48.501000 424521 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2273401Z [rank3]:E1204 13:32:48.501000 424521 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2273796Z [rank3]:E1204 13:32:48.501000 424521 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2273917Z [rank3]:E1204 13:32:48.501000 424521 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2274145Z [rank3]:E1204 13:32:48.501000 424521 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2274321Z [rank3]:E1204 13:32:48.501000 424521 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2274363Z dist init r=3, world=4 2025-12-04T13:38:32.2274510Z [rank0]:E1204 13:32:48.509000 424518 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2274681Z [rank0]:E1204 13:32:48.509000 424518 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2275000Z [rank0]:E1204 13:32:48.509000 424518 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2275179Z [rank0]:E1204 13:32:48.509000 424518 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2275490Z [rank0]:E1204 13:32:48.509000 424518 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2275623Z [rank0]:E1204 13:32:48.509000 424518 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2275920Z [rank0]:E1204 13:32:48.509000 424518 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2276081Z [rank0]:E1204 13:32:48.509000 424518 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:38:32.2276380Z [rank0]:E1204 13:32:48.509000 424518 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2276548Z [rank0]:E1204 13:32:48.509000 424518 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2276847Z [rank0]:E1204 13:32:48.509000 424518 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2276992Z [rank0]:E1204 13:32:48.509000 424518 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2277292Z [rank0]:E1204 13:32:48.509000 424518 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2277450Z [rank0]:E1204 13:32:48.509000 424518 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2277988Z [rank0]:E1204 13:32:48.509000 424518 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 51712 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 2025-12-04T13:38:32.2278111Z [rank0]:E1204 13:32:48.509000 424518 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2278321Z [rank0]:E1204 13:32:48.509000 424518 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2278719Z [rank0]:E1204 13:32:48.509000 424518 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2278841Z [rank0]:E1204 13:32:48.509000 424518 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2279071Z [rank0]:E1204 13:32:48.509000 424518 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2279245Z [rank0]:E1204 13:32:48.509000 424518 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2279287Z dist init r=0, world=4 2025-12-04T13:38:32.2279710Z [rank0]:[W1204 13:32:48.836988417 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.2279773Z FAILED [9.6197s] [ 10%] 2025-12-04T13:38:32.2279775Z 2025-12-04T13:38:32.2279836Z =================================== FAILURES =================================== 2025-12-04T13:38:32.2279951Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda _ 2025-12-04T13:38:32.2280001Z Traceback (most recent call last): 2025-12-04T13:38:32.2280177Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.2280225Z self._join_processes(fn) 2025-12-04T13:38:32.2280412Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.2280471Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.2280664Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.2280727Z raise RuntimeError(error) 2025-12-04T13:38:32.2280811Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.2280860Z Traceback (most recent call last): 2025-12-04T13:38:32.2281033Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2281079Z getattr(self, test_name)() 2025-12-04T13:38:32.2281251Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2281288Z fn() 2025-12-04T13:38:32.2281453Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2281498Z method(*args, **kwargs) 2025-12-04T13:38:32.2281661Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2281706Z method(*args, **kwargs) 2025-12-04T13:38:32.2281871Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2281910Z with policy(): 2025-12-04T13:38:32.2282076Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2282119Z raise RuntimeError(msg) 2025-12-04T13:38:32.2282523Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 53760 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 
2025-12-04T13:38:32.2282526Z 2025-12-04T13:38:32.2282606Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2282869Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2282874Z 2025-12-04T13:38:32.2282967Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2282969Z 2025-12-04T13:38:32.2282971Z 2025-12-04T13:38:32.2283052Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.2283148Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.2283398Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6bd2a6a86159d468.xml - 2025-12-04T13:38:32.2283474Z =========================== short test summary info ============================ 2025-12-04T13:38:32.2283750Z FAILED [9.6197s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_shard_grad_op_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.2283813Z Traceback (most recent call last): 2025-12-04T13:38:32.2283988Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2284034Z getattr(self, test_name)() 2025-12-04T13:38:32.2284206Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2284245Z fn() 2025-12-04T13:38:32.2284406Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2284450Z method(*args, **kwargs) 2025-12-04T13:38:32.2284614Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2284656Z method(*args, **kwargs) 2025-12-04T13:38:32.2284818Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2284871Z with policy(): 2025-12-04T13:38:32.2285036Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2285081Z raise RuntimeError(msg) 2025-12-04T13:38:32.2285476Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 53760 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 2025-12-04T13:38:32.2285479Z 2025-12-04T13:38:32.2285557Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2285819Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2285822Z 2025-12-04T13:38:32.2285914Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2285984Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
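Note on the failure above: it is raised by the CUDA memory-leak checker that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 enables, which records caching-allocator and driver-reported memory before and after the test body and errors out when both have grown. The following is only a simplified sketch of that comparison using public torch.cuda APIs; the helper names (snapshot, check_leak) are illustrative and are not the actual wrapper internals in common_utils.py.

import torch

def snapshot(device: int):
    # Bytes currently held by the CUDA caching allocator on this device.
    alloc = torch.cuda.memory_allocated(device)
    # Driver-level view: total minus free bytes reported by the driver.
    free, total = torch.cuda.mem_get_info(device)
    return alloc, total - free

def check_leak(before, after, device: int):
    alloc_before, driver_before = before
    alloc_after, driver_after = after
    # Flag a leak only when the driver confirms the growth seen by the
    # caching allocator, mirroring the "driver API confirmed a leak" wording
    # in the log above (simplified; thresholds and retries are omitted).
    if alloc_after > alloc_before and driver_after > driver_before:
        raise RuntimeError(
            f"possible CUDA memory leak on device {device}: caching allocator "
            f"{alloc_before} -> {alloc_after}, driver {driver_before} -> {driver_after}"
        )

# Illustrative usage around a test body:
#   before = snapshot(dev)
#   run_test_body()
#   torch.cuda.synchronize(dev)
#   check_leak(before, snapshot(dev), dev)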
2025-12-04T13:38:32.2286049Z ======================= 1 failed, 23 deselected in 9.78s ======================= 2025-12-04T13:38:32.2286089Z Got exit code 1 2025-12-04T13:38:32.2286130Z Retrying single test... 2025-12-04T13:38:32.2286345Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d1354b33471f0cd9.xml 2025-12-04T13:38:32.2286407Z ============================= test session starts ============================== 2025-12-04T13:38:32.2286532Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.2286575Z cachedir: .pytest_cache 2025-12-04T13:38:32.2286746Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.2286796Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.2286840Z configfile: pytest.ini 2025-12-04T13:38:32.2287013Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.2287092Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.2287346Z stepcurrent: skipping 23 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2287393Z Running 1 items in this shard 2025-12-04T13:38:32.2287395Z 2025-12-04T13:38:32.2287750Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_shard_grad_op_cuda I1204 13:32:52.870000 424851 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 424920 2025-12-04T13:38:32.2287929Z I1204 13:32:52.870000 424851 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 424921 2025-12-04T13:38:32.2288094Z I1204 13:32:52.871000 424851 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 424922 2025-12-04T13:38:32.2288255Z I1204 13:32:52.871000 424851 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 424923 2025-12-04T13:38:32.2288883Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2288924Z _warn_cpu_init() 2025-12-04T13:38:32.2289540Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.2289648Z _warn_cpu_init() 2025-12-04T13:38:32.2290262Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2290302Z _warn_cpu_init() 2025-12-04T13:38:32.2290932Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2290971Z _warn_cpu_init() 2025-12-04T13:38:32.2291285Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.2291330Z return func(*args, **kwargs) 2025-12-04T13:38:32.2291485Z [rank3]:E1204 13:33:00.498000 424923 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2291659Z [rank3]:E1204 13:33:00.498000 424923 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2291973Z [rank3]:E1204 13:33:00.498000 424923 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2292139Z [rank3]:E1204 13:33:00.498000 424923 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2292461Z [rank3]:E1204 13:33:00.498000 424923 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2292597Z [rank3]:E1204 13:33:00.498000 424923 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2292909Z [rank3]:E1204 13:33:00.498000 424923 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2293070Z [rank3]:E1204 13:33:00.498000 424923 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2293368Z [rank3]:E1204 13:33:00.498000 424923 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2293528Z [rank3]:E1204 13:33:00.498000 424923 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2293825Z [rank3]:E1204 13:33:00.498000 424923 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2293987Z [rank3]:E1204 13:33:00.498000 424923 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2294288Z [rank3]:E1204 13:33:00.498000 424923 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2294448Z [rank3]:E1204 13:33:00.498000 424923 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2294976Z [rank3]:E1204 13:33:00.498000 424923 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 53760 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 2025-12-04T13:38:32.2295101Z [rank3]:E1204 13:33:00.498000 424923 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2295311Z [rank3]:E1204 13:33:00.498000 424923 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2295723Z [rank3]:E1204 13:33:00.498000 424923 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2295847Z [rank3]:E1204 13:33:00.498000 424923 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2296078Z [rank3]:E1204 13:33:00.498000 424923 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2296257Z [rank3]:E1204 13:33:00.498000 424923 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2296301Z dist init r=3, world=4 2025-12-04T13:38:32.2296448Z [rank2]:E1204 13:33:00.551000 424922 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2296623Z [rank2]:E1204 13:33:00.551000 424922 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2296932Z [rank2]:E1204 13:33:00.551000 424922 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2297119Z [rank2]:E1204 13:33:00.551000 424922 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2297426Z [rank2]:E1204 13:33:00.551000 424922 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2297575Z [rank2]:E1204 13:33:00.551000 424922 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2297877Z [rank2]:E1204 13:33:00.551000 424922 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2298036Z [rank2]:E1204 13:33:00.551000 424922 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2298337Z [rank2]:E1204 13:33:00.551000 424922 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2298496Z [rank2]:E1204 13:33:00.551000 424922 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2298807Z [rank2]:E1204 13:33:00.551000 424922 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2298953Z [rank2]:E1204 13:33:00.551000 424922 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2299252Z [rank2]:E1204 13:33:00.551000 424922 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2299413Z [rank2]:E1204 13:33:00.551000 424922 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2299982Z [rank2]:E1204 13:33:00.551000 424922 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 2. CUDA driver allocated memory was 2300575744 and is now 3833593856. 
2025-12-04T13:38:32.2300107Z [rank2]:E1204 13:33:00.551000 424922 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2300338Z [rank2]:E1204 13:33:00.551000 424922 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2300743Z [rank2]:E1204 13:33:00.551000 424922 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2300865Z [rank2]:E1204 13:33:00.551000 424922 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2301094Z [rank2]:E1204 13:33:00.551000 424922 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2301272Z [rank2]:E1204 13:33:00.551000 424922 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2301314Z dist init r=2, world=4 2025-12-04T13:38:32.2301462Z [rank1]:E1204 13:33:00.571000 424921 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2301653Z [rank1]:E1204 13:33:00.571000 424921 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2301963Z [rank1]:E1204 13:33:00.571000 424921 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2302147Z [rank1]:E1204 13:33:00.571000 424921 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2302455Z [rank1]:E1204 13:33:00.571000 424921 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2302587Z [rank1]:E1204 13:33:00.571000 424921 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2302887Z [rank1]:E1204 13:33:00.571000 424921 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2303048Z [rank1]:E1204 13:33:00.571000 424921 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2303365Z [rank1]:E1204 13:33:00.571000 424921 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2303524Z [rank1]:E1204 13:33:00.571000 424921 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2303823Z [rank1]:E1204 13:33:00.571000 424921 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2303970Z [rank1]:E1204 13:33:00.571000 424921 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.2304269Z [rank1]:E1204 13:33:00.571000 424921 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2304430Z [rank1]:E1204 13:33:00.571000 424921 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2304965Z [rank1]:E1204 13:33:00.571000 424921 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 2025-12-04T13:38:32.2305086Z [rank1]:E1204 13:33:00.571000 424921 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2305297Z [rank1]:E1204 13:33:00.571000 424921 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2305699Z [rank1]:E1204 13:33:00.571000 424921 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2305822Z [rank1]:E1204 13:33:00.571000 424921 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2306050Z [rank1]:E1204 13:33:00.571000 424921 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2306225Z [rank1]:E1204 13:33:00.571000 424921 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2306278Z dist init r=1, world=4 2025-12-04T13:38:32.2306424Z [rank0]:E1204 13:33:00.573000 424920 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2306606Z [rank0]:E1204 13:33:00.573000 424920 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2306916Z [rank0]:E1204 13:33:00.573000 424920 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2307082Z [rank0]:E1204 13:33:00.573000 424920 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2307388Z [rank0]:E1204 13:33:00.573000 424920 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2307522Z [rank0]:E1204 13:33:00.573000 424920 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2307822Z [rank0]:E1204 13:33:00.573000 424920 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2307992Z [rank0]:E1204 13:33:00.573000 424920 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:38:32.2308292Z [rank0]:E1204 13:33:00.573000 424920 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2308449Z [rank0]:E1204 13:33:00.573000 424920 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2308746Z [rank0]:E1204 13:33:00.573000 424920 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2308892Z [rank0]:E1204 13:33:00.573000 424920 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2309190Z [rank0]:E1204 13:33:00.573000 424920 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2309359Z [rank0]:E1204 13:33:00.573000 424920 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2309933Z [rank0]:E1204 13:33:00.573000 424920 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 57856 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 2025-12-04T13:38:32.2310057Z [rank0]:E1204 13:33:00.573000 424920 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2310266Z [rank0]:E1204 13:33:00.573000 424920 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2310663Z [rank0]:E1204 13:33:00.573000 424920 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2310783Z [rank0]:E1204 13:33:00.573000 424920 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2311033Z [rank0]:E1204 13:33:00.573000 424920 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2311227Z [rank0]:E1204 13:33:00.573000 424920 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2311268Z dist init r=0, world=4 2025-12-04T13:38:32.2311632Z [rank0]:[W1204 13:33:00.970686317 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.2311674Z FAILED [9.5197s] [100%] 2025-12-04T13:38:32.2311676Z 2025-12-04T13:38:32.2311736Z =================================== FAILURES =================================== 2025-12-04T13:38:32.2311850Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda _ 2025-12-04T13:38:32.2311901Z Traceback (most recent call last): 2025-12-04T13:38:32.2312076Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.2312125Z self._join_processes(fn) 2025-12-04T13:38:32.2312325Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.2312383Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.2312574Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.2312622Z raise RuntimeError(error) 2025-12-04T13:38:32.2312707Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.2312755Z Traceback (most recent call last): 2025-12-04T13:38:32.2312929Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2312975Z getattr(self, test_name)() 2025-12-04T13:38:32.2313147Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2313186Z fn() 2025-12-04T13:38:32.2313351Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2313395Z method(*args, **kwargs) 2025-12-04T13:38:32.2313559Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2313601Z method(*args, **kwargs) 2025-12-04T13:38:32.2313791Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2313830Z with policy(): 2025-12-04T13:38:32.2313997Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2314040Z raise RuntimeError(msg) 2025-12-04T13:38:32.2314430Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 53760 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 
2025-12-04T13:38:32.2314435Z 2025-12-04T13:38:32.2314514Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2314776Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2314779Z 2025-12-04T13:38:32.2314872Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2314876Z 2025-12-04T13:38:32.2314878Z 2025-12-04T13:38:32.2314968Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.2315063Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.2315312Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d1354b33471f0cd9.xml - 2025-12-04T13:38:32.2315391Z =========================== short test summary info ============================ 2025-12-04T13:38:32.2315673Z FAILED [9.5197s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_shard_grad_op_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.2315723Z Traceback (most recent call last): 2025-12-04T13:38:32.2315900Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2315947Z getattr(self, test_name)() 2025-12-04T13:38:32.2316119Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2316156Z fn() 2025-12-04T13:38:32.2316321Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2316376Z method(*args, **kwargs) 2025-12-04T13:38:32.2316537Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2316581Z method(*args, **kwargs) 2025-12-04T13:38:32.2316743Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2316784Z with policy(): 2025-12-04T13:38:32.2316946Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2316989Z raise RuntimeError(msg) 2025-12-04T13:38:32.2317378Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 53760 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 2025-12-04T13:38:32.2317383Z 2025-12-04T13:38:32.2317460Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2317721Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2317723Z 2025-12-04T13:38:32.2317826Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2317898Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
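Note on the warnings repeated in each retry above: the FSDP _warn_cpu_init() UserWarning recommends passing device_id so sharding initialization runs on GPU, and the ProcessGroupNCCL warning flags that destroy_process_group() was never called before exit. The sketch below only illustrates a rank following both recommendations; the toy Linear module is a placeholder, and the launcher (e.g. torchrun) is assumed to set RANK, WORLD_SIZE, MASTER_ADDR, and MASTER_PORT.

import os
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def main():
    # RANK/WORLD_SIZE (and MASTER_ADDR/MASTER_PORT) are assumed to be set by
    # the launcher, e.g. torchrun --nproc-per-node=4 this_script.py
    rank = int(os.environ["RANK"])
    world_size = int(os.environ["WORLD_SIZE"])
    device = torch.device("cuda", rank % torch.cuda.device_count())

    # Passing device_id here is what the barrier() UserWarning above suggests.
    dist.init_process_group("nccl", rank=rank, world_size=world_size, device_id=device)

    model = torch.nn.Linear(8, 8)  # placeholder module, constructed on CPU
    # device_id tells FSDP to move the module to GPU for sharding init instead
    # of the slower CPU initialization path the warning describes.
    fsdp_model = FSDP(model, device_id=device)

    loss = fsdp_model(torch.randn(4, 8, device=device)).sum()
    loss.backward()

    # Explicit shutdown avoids the ProcessGroupNCCL resource-leak warning.
    dist.destroy_process_group()

if __name__ == "__main__":
    main()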
2025-12-04T13:38:32.2317963Z ======================= 1 failed, 32 deselected in 9.68s ======================= 2025-12-04T13:38:32.2318003Z Got exit code 1 2025-12-04T13:38:32.2318045Z Retrying single test... 2025-12-04T13:38:32.2318249Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-611c1cc009937671.xml 2025-12-04T13:38:32.2318312Z ============================= test session starts ============================== 2025-12-04T13:38:32.2318434Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.2318476Z cachedir: .pytest_cache 2025-12-04T13:38:32.2318649Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.2318698Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.2318742Z configfile: pytest.ini 2025-12-04T13:38:32.2318917Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.2319007Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.2319261Z stepcurrent: skipping 23 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2319319Z Running 1 items in this shard 2025-12-04T13:38:32.2319322Z 2025-12-04T13:38:32.2319701Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_shard_grad_op_cuda I1204 13:33:05.115000 425253 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 425322 2025-12-04T13:38:32.2319871Z I1204 13:33:05.116000 425253 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 425323 2025-12-04T13:38:32.2320038Z I1204 13:33:05.116000 425253 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 425324 2025-12-04T13:38:32.2320201Z I1204 13:33:05.117000 425253 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 425325 2025-12-04T13:38:32.2320832Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2320893Z _warn_cpu_init() 2025-12-04T13:38:32.2321513Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.2321557Z _warn_cpu_init() 2025-12-04T13:38:32.2322170Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2322215Z _warn_cpu_init() 2025-12-04T13:38:32.2322844Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2322886Z _warn_cpu_init() 2025-12-04T13:38:32.2323207Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.2323254Z return func(*args, **kwargs) 2025-12-04T13:38:32.2323411Z [rank0]:E1204 13:33:12.748000 425322 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2323587Z [rank0]:E1204 13:33:12.748000 425322 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2323920Z [rank0]:E1204 13:33:12.748000 425322 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2324089Z [rank0]:E1204 13:33:12.748000 425322 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2324412Z [rank0]:E1204 13:33:12.748000 425322 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2324549Z [rank0]:E1204 13:33:12.748000 425322 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2324848Z [rank0]:E1204 13:33:12.748000 425322 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2325012Z [rank0]:E1204 13:33:12.748000 425322 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2325312Z [rank0]:E1204 13:33:12.748000 425322 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2325487Z [rank0]:E1204 13:33:12.748000 425322 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2325787Z [rank0]:E1204 13:33:12.748000 425322 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2325941Z [rank0]:E1204 13:33:12.748000 425322 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2326247Z [rank0]:E1204 13:33:12.748000 425322 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2326407Z [rank0]:E1204 13:33:12.748000 425322 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2326939Z [rank0]:E1204 13:33:12.748000 425322 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 57856 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 2025-12-04T13:38:32.2327076Z [rank0]:E1204 13:33:12.748000 425322 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2327291Z [rank0]:E1204 13:33:12.748000 425322 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2327691Z [rank0]:E1204 13:33:12.748000 425322 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2327815Z [rank0]:E1204 13:33:12.748000 425322 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2328046Z [rank0]:E1204 13:33:12.748000 425322 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2328223Z [rank0]:E1204 13:33:12.748000 425322 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2328268Z dist init r=0, world=4 2025-12-04T13:38:32.2328417Z [rank1]:E1204 13:33:12.775000 425323 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2328604Z [rank1]:E1204 13:33:12.775000 425323 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2328912Z [rank1]:E1204 13:33:12.775000 425323 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2329095Z [rank1]:E1204 13:33:12.775000 425323 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2329405Z [rank1]:E1204 13:33:12.775000 425323 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2329539Z [rank1]:E1204 13:33:12.775000 425323 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2329896Z [rank1]:E1204 13:33:12.775000 425323 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2330057Z [rank1]:E1204 13:33:12.775000 425323 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2330376Z [rank1]:E1204 13:33:12.775000 425323 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2330535Z [rank1]:E1204 13:33:12.775000 425323 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2330842Z [rank1]:E1204 13:33:12.775000 425323 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2330989Z [rank1]:E1204 13:33:12.775000 425323 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2331291Z [rank1]:E1204 13:33:12.775000 425323 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2331455Z [rank1]:E1204 13:33:12.775000 425323 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2331988Z [rank1]:E1204 13:33:12.775000 425323 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 
2025-12-04T13:38:32.2332118Z [rank1]:E1204 13:33:12.775000 425323 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2332330Z [rank1]:E1204 13:33:12.775000 425323 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2332733Z [rank1]:E1204 13:33:12.775000 425323 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2332860Z [rank1]:E1204 13:33:12.775000 425323 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2333089Z [rank1]:E1204 13:33:12.775000 425323 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2333283Z [rank1]:E1204 13:33:12.775000 425323 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2333325Z dist init r=1, world=4 2025-12-04T13:38:32.2333493Z [rank2]:E1204 13:33:12.809000 425324 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2333666Z [rank2]:E1204 13:33:12.809000 425324 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2333977Z [rank2]:E1204 13:33:12.809000 425324 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2334143Z [rank2]:E1204 13:33:12.809000 425324 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2334453Z [rank2]:E1204 13:33:12.809000 425324 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2334592Z [rank2]:E1204 13:33:12.809000 425324 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2334902Z [rank2]:E1204 13:33:12.809000 425324 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2335064Z [rank2]:E1204 13:33:12.809000 425324 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2335363Z [rank2]:E1204 13:33:12.809000 425324 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2335526Z [rank2]:E1204 13:33:12.809000 425324 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2335823Z [rank2]:E1204 13:33:12.809000 425324 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2335975Z [rank2]:E1204 13:33:12.809000 425324 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.2336290Z [rank2]:E1204 13:33:12.809000 425324 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2336451Z [rank2]:E1204 13:33:12.809000 425324 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2336975Z [rank2]:E1204 13:33:12.809000 425324 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 2. CUDA driver allocated memory was 2300575744 and is now 3833593856. 2025-12-04T13:38:32.2337100Z [rank2]:E1204 13:33:12.809000 425324 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2337315Z [rank2]:E1204 13:33:12.809000 425324 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2337713Z [rank2]:E1204 13:33:12.809000 425324 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2337851Z [rank2]:E1204 13:33:12.809000 425324 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2338083Z [rank2]:E1204 13:33:12.809000 425324 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2338272Z [rank2]:E1204 13:33:12.809000 425324 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2338317Z dist init r=2, world=4 2025-12-04T13:38:32.2338465Z [rank3]:E1204 13:33:12.839000 425325 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2338640Z [rank3]:E1204 13:33:12.839000 425325 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2338950Z [rank3]:E1204 13:33:12.839000 425325 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2339120Z [rank3]:E1204 13:33:12.839000 425325 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2339428Z [rank3]:E1204 13:33:12.839000 425325 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2339611Z [rank3]:E1204 13:33:12.839000 425325 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2339915Z [rank3]:E1204 13:33:12.839000 425325 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2340076Z [rank3]:E1204 13:33:12.839000 425325 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:38:32.2340376Z [rank3]:E1204 13:33:12.839000 425325 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2340537Z [rank3]:E1204 13:33:12.839000 425325 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2340838Z [rank3]:E1204 13:33:12.839000 425325 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2341000Z [rank3]:E1204 13:33:12.839000 425325 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2341302Z [rank3]:E1204 13:33:12.839000 425325 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2341465Z [rank3]:E1204 13:33:12.839000 425325 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2341989Z [rank3]:E1204 13:33:12.839000 425325 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 2025-12-04T13:38:32.2342114Z [rank3]:E1204 13:33:12.839000 425325 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2342324Z [rank3]:E1204 13:33:12.839000 425325 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2342736Z [rank3]:E1204 13:33:12.839000 425325 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2342872Z [rank3]:E1204 13:33:12.839000 425325 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2343103Z [rank3]:E1204 13:33:12.839000 425325 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2343283Z [rank3]:E1204 13:33:12.839000 425325 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2343325Z dist init r=3, world=4 2025-12-04T13:38:32.2343692Z [rank0]:[W1204 13:33:12.910619684 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.2343735Z FAILED [9.4220s] [100%] 2025-12-04T13:38:32.2343738Z 2025-12-04T13:38:32.2343803Z =================================== FAILURES =================================== 2025-12-04T13:38:32.2343935Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda _ 2025-12-04T13:38:32.2343989Z Traceback (most recent call last): 2025-12-04T13:38:32.2344166Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.2344217Z self._join_processes(fn) 2025-12-04T13:38:32.2344405Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.2344467Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.2344661Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.2344712Z raise RuntimeError(error) 2025-12-04T13:38:32.2344797Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.2344850Z Traceback (most recent call last): 2025-12-04T13:38:32.2345024Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2345073Z getattr(self, test_name)() 2025-12-04T13:38:32.2345246Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2345295Z fn() 2025-12-04T13:38:32.2345463Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2345508Z method(*args, **kwargs) 2025-12-04T13:38:32.2345676Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2345720Z method(*args, **kwargs) 2025-12-04T13:38:32.2345886Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2345930Z with policy(): 2025-12-04T13:38:32.2346099Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2346143Z raise RuntimeError(msg) 2025-12-04T13:38:32.2346535Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 57856 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 
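[editor note] The ProcessGroupNCCL warning above says destroy_process_group() was never called before the worker processes exited, which can leak communicator resources. A minimal, hypothetical sketch of the recommended shutdown pattern is below; the backend and rendezvous setup are illustrative and not taken from the test harness, which wires these up itself.

import torch.distributed as dist

def run_worker(rank: int, world_size: int) -> None:
    # Illustrative setup; assumes MASTER_ADDR/MASTER_PORT are already exported.
    dist.init_process_group(backend="nccl", rank=rank, world_size=world_size)
    try:
        pass  # per-rank collective work would go here
    finally:
        # Explicit teardown avoids the "destroy_process_group() was not called"
        # warning and releases process-group resources before the process exits.
        dist.destroy_process_group()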
2025-12-04T13:38:32.2346538Z 2025-12-04T13:38:32.2346621Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2346899Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2346919Z 2025-12-04T13:38:32.2347013Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2347019Z 2025-12-04T13:38:32.2347021Z 2025-12-04T13:38:32.2347102Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.2347199Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.2347449Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-611c1cc009937671.xml - 2025-12-04T13:38:32.2347518Z =========================== short test summary info ============================ 2025-12-04T13:38:32.2347797Z FAILED [9.4220s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.2347849Z Traceback (most recent call last): 2025-12-04T13:38:32.2348028Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2348090Z getattr(self, test_name)() 2025-12-04T13:38:32.2348262Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2348303Z fn() 2025-12-04T13:38:32.2348469Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2348516Z method(*args, **kwargs) 2025-12-04T13:38:32.2348683Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2348728Z method(*args, **kwargs) 2025-12-04T13:38:32.2348891Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2348935Z with policy(): 2025-12-04T13:38:32.2349103Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2349149Z raise RuntimeError(msg) 2025-12-04T13:38:32.2349547Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 57856 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 2025-12-04T13:38:32.2349550Z 2025-12-04T13:38:32.2349675Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2349943Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2349946Z 2025-12-04T13:38:32.2350040Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2350113Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
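[editor note] The repeated RuntimeError above is raised by PyTorch's CUDA memory leak check, enabled for this shard via PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 and the mem_leak_check test-matrix flag: it snapshots caching-allocator and driver memory before the test and fails when the numbers are higher afterwards. The following is only a rough, hypothetical illustration of that before/after comparison, not the actual common_utils implementation.

import torch

def check_for_leak(fn, device: int = 0) -> None:
    # Snapshot caching-allocator usage and driver-level free memory beforehand.
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)
    free_before, _total = torch.cuda.mem_get_info(device)

    fn()  # run the test body

    torch.cuda.synchronize(device)
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _total = torch.cuda.mem_get_info(device)

    # Driver-allocated memory grows when free memory shrinks; flag both together.
    if alloc_after > alloc_before and free_after < free_before:
        raise RuntimeError(
            f"possible leak: allocator {alloc_before} -> {alloc_after}, "
            f"driver free {free_before} -> {free_after}"
        )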
2025-12-04T13:38:32.2350180Z ======================= 1 failed, 32 deselected in 9.58s ======================= 2025-12-04T13:38:32.2350223Z Got exit code 1 2025-12-04T13:38:32.2350431Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2350573Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:32.2350776Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d437c85c7343d74e.xml 2025-12-04T13:38:32.2350842Z ============================= test session starts ============================== 2025-12-04T13:38:32.2350981Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.2351026Z cachedir: .pytest_cache 2025-12-04T13:38:32.2351216Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.2351268Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.2351315Z configfile: pytest.ini 2025-12-04T13:38:32.2351491Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.2351574Z collecting ... collected 60 items / 24 deselected / 36 selected 2025-12-04T13:38:32.2351631Z stepcurrent: skipping 24 already run items. 2025-12-04T13:38:32.2351681Z Running 9 items in this shard 2025-12-04T13:38:32.2351683Z 2025-12-04T13:38:32.2352019Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_no_shard_cuda I1204 13:33:17.235000 425655 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 425724 2025-12-04T13:38:32.2352191Z I1204 13:33:17.236000 425655 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 425725 2025-12-04T13:38:32.2352372Z I1204 13:33:17.237000 425655 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 425726 2025-12-04T13:38:32.2352537Z I1204 13:33:17.237000 425655 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 425727 2025-12-04T13:38:32.2352856Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2352915Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.2353542Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2353585Z _warn_cpu_init() 2025-12-04T13:38:32.2353901Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:38:32.2353968Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.2354587Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2354632Z _warn_cpu_init() 2025-12-04T13:38:32.2354943Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2354999Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.2355623Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2355667Z _warn_cpu_init() 2025-12-04T13:38:32.2355989Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2356078Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.2356391Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2356476Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.2356789Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2356867Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.2357189Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.2357248Z return func(*args, **kwargs) 2025-12-04T13:38:32.2357560Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2357613Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.2358238Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2358282Z _warn_cpu_init() 2025-12-04T13:38:32.2358591Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2358675Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.2358939Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.2358989Z return func(*args, **kwargs) 2025-12-04T13:38:32.2359233Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.2359282Z return func(*args, **kwargs) 2025-12-04T13:38:32.2359523Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.2359618Z return func(*args, **kwargs) 2025-12-04T13:38:32.2359858Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.2359907Z return func(*args, **kwargs) 2025-12-04T13:38:32.2360144Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.2360193Z return func(*args, **kwargs) 2025-12-04T13:38:32.2360448Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.2360506Z return func(*args, **kwargs) 2025-12-04T13:38:32.2360748Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.2360793Z return func(*args, **kwargs) 2025-12-04T13:38:32.2361035Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 
2025-12-04T13:38:32.2361079Z return func(*args, **kwargs) 2025-12-04T13:38:32.2361240Z [rank1]:E1204 13:33:24.933000 425725 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2361418Z [rank1]:E1204 13:33:24.933000 425725 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2361742Z [rank1]:E1204 13:33:24.933000 425725 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2361925Z [rank1]:E1204 13:33:24.933000 425725 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2362237Z [rank1]:E1204 13:33:24.933000 425725 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2362375Z [rank1]:E1204 13:33:24.933000 425725 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2362676Z [rank1]:E1204 13:33:24.933000 425725 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2362840Z [rank1]:E1204 13:33:24.933000 425725 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2363138Z [rank1]:E1204 13:33:24.933000 425725 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2363313Z [rank1]:E1204 13:33:24.933000 425725 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2363611Z [rank1]:E1204 13:33:24.933000 425725 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2363762Z [rank1]:E1204 13:33:24.933000 425725 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2364070Z [rank1]:E1204 13:33:24.933000 425725 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2364233Z [rank1]:E1204 13:33:24.933000 425725 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2364761Z [rank1]:E1204 13:33:24.933000 425725 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 
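[editor note] The _warn_cpu_init UserWarning above recommends passing the device_id argument so FSDP runs its sharding initialization on the GPU rather than the CPU. A minimal sketch of that call follows; the Linear module is a placeholder for the nested wrapped test model, and an initialized process group is assumed.

import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# Placeholder module; the test wraps a nested model instead.
model = torch.nn.Linear(16, 16)

# device_id moves the module to the GPU before sharding, which also
# satisfies the requirement noted for sync_module_states=True.
fsdp_model = FSDP(model, device_id=torch.cuda.current_device())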
2025-12-04T13:38:32.2364905Z [rank1]:E1204 13:33:24.933000 425725 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2365120Z [rank1]:E1204 13:33:24.933000 425725 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2365524Z [rank1]:E1204 13:33:24.933000 425725 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda 2025-12-04T13:38:32.2365651Z [rank1]:E1204 13:33:24.933000 425725 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2365886Z [rank1]:E1204 13:33:24.933000 425725 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2366064Z [rank1]:E1204 13:33:24.933000 425725 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2366109Z dist init r=1, world=4 2025-12-04T13:38:32.2366257Z [rank0]:E1204 13:33:24.983000 425724 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2366445Z [rank0]:E1204 13:33:24.983000 425724 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2366757Z [rank0]:E1204 13:33:24.983000 425724 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2366928Z [rank0]:E1204 13:33:24.983000 425724 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2367237Z [rank0]:E1204 13:33:24.983000 425724 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2367373Z [rank0]:E1204 13:33:24.983000 425724 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2367676Z [rank0]:E1204 13:33:24.983000 425724 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2367835Z [rank0]:E1204 13:33:24.983000 425724 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2368150Z [rank0]:E1204 13:33:24.983000 425724 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2368309Z [rank0]:E1204 13:33:24.983000 425724 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2368611Z [rank0]:E1204 13:33:24.983000 425724 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2368759Z [rank0]:E1204 13:33:24.983000 425724 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.2369064Z [rank0]:E1204 13:33:24.983000 425724 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2369230Z [rank0]:E1204 13:33:24.983000 425724 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2369826Z [rank0]:E1204 13:33:24.983000 425724 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 99840 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T13:38:32.2369969Z [rank0]:E1204 13:33:24.983000 425724 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2370180Z [rank0]:E1204 13:33:24.983000 425724 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2370574Z [rank0]:E1204 13:33:24.983000 425724 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda 2025-12-04T13:38:32.2370695Z [rank0]:E1204 13:33:24.983000 425724 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2370929Z [rank0]:E1204 13:33:24.983000 425724 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2371111Z [rank0]:E1204 13:33:24.983000 425724 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2371168Z dist init r=0, world=4 2025-12-04T13:38:32.2371321Z [rank3]:E1204 13:33:25.019000 425727 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2371495Z [rank3]:E1204 13:33:25.019000 425727 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2371810Z [rank3]:E1204 13:33:25.019000 425727 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2371977Z [rank3]:E1204 13:33:25.019000 425727 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2372288Z [rank3]:E1204 13:33:25.019000 425727 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2372423Z [rank3]:E1204 13:33:25.019000 425727 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2372741Z [rank3]:E1204 13:33:25.019000 425727 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2372902Z [rank3]:E1204 13:33:25.019000 425727 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:38:32.2373200Z [rank3]:E1204 13:33:25.019000 425727 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2373360Z [rank3]:E1204 13:33:25.019000 425727 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2373660Z [rank3]:E1204 13:33:25.019000 425727 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2373811Z [rank3]:E1204 13:33:25.019000 425727 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2374110Z [rank3]:E1204 13:33:25.019000 425727 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2374291Z [rank3]:E1204 13:33:25.019000 425727 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2374825Z [rank3]:E1204 13:33:25.019000 425727 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 99840 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T13:38:32.2374952Z [rank3]:E1204 13:33:25.019000 425727 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2375172Z [rank3]:E1204 13:33:25.019000 425727 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2375567Z [rank3]:E1204 13:33:25.019000 425727 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda 2025-12-04T13:38:32.2375693Z [rank3]:E1204 13:33:25.019000 425727 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2375935Z [rank3]:E1204 13:33:25.019000 425727 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2376114Z [rank3]:E1204 13:33:25.019000 425727 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2376160Z dist init r=3, world=4 2025-12-04T13:38:32.2376308Z [rank2]:E1204 13:33:25.024000 425726 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2376485Z [rank2]:E1204 13:33:25.024000 425726 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2376798Z [rank2]:E1204 13:33:25.024000 425726 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2376968Z [rank2]:E1204 13:33:25.024000 425726 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:38:32.2377289Z [rank2]:E1204 13:33:25.024000 425726 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2377424Z [rank2]:E1204 13:33:25.024000 425726 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2377723Z [rank2]:E1204 13:33:25.024000 425726 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2377884Z [rank2]:E1204 13:33:25.024000 425726 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2378188Z [rank2]:E1204 13:33:25.024000 425726 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2378347Z [rank2]:E1204 13:33:25.024000 425726 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2378650Z [rank2]:E1204 13:33:25.024000 425726 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2378809Z [rank2]:E1204 13:33:25.024000 425726 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2379111Z [rank2]:E1204 13:33:25.024000 425726 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2379288Z [rank2]:E1204 13:33:25.024000 425726 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2379858Z [rank2]:E1204 13:33:25.024000 425726 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 108032 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 
2025-12-04T13:38:32.2379985Z [rank2]:E1204 13:33:25.024000 425726 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2380197Z [rank2]:E1204 13:33:25.024000 425726 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2380587Z [rank2]:E1204 13:33:25.024000 425726 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda 2025-12-04T13:38:32.2380725Z [rank2]:E1204 13:33:25.024000 425726 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2380956Z [rank2]:E1204 13:33:25.024000 425726 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2381133Z [rank2]:E1204 13:33:25.024000 425726 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2381179Z dist init r=2, world=4 2025-12-04T13:38:32.2381545Z [rank0]:[W1204 13:33:25.248162483 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.2381590Z FAILED [9.6210s] [ 11%] 2025-12-04T13:38:32.2381592Z 2025-12-04T13:38:32.2381656Z =================================== FAILURES =================================== 2025-12-04T13:38:32.2381762Z __ TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda __ 2025-12-04T13:38:32.2381838Z Traceback (most recent call last): 2025-12-04T13:38:32.2382017Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.2382067Z self._join_processes(fn) 2025-12-04T13:38:32.2382255Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.2382316Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.2382512Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.2382565Z raise RuntimeError(error) 2025-12-04T13:38:32.2382650Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.2382703Z Traceback (most recent call last): 2025-12-04T13:38:32.2382878Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2382931Z getattr(self, test_name)() 2025-12-04T13:38:32.2383102Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2383143Z fn() 2025-12-04T13:38:32.2383324Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2383372Z method(*args, **kwargs) 2025-12-04T13:38:32.2383552Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2383600Z method(*args, **kwargs) 2025-12-04T13:38:32.2383766Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2383806Z with policy(): 2025-12-04T13:38:32.2383976Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2384021Z raise RuntimeError(msg) 2025-12-04T13:38:32.2384413Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 2025-12-04T13:38:32.2384415Z 2025-12-04T13:38:32.2387102Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2387364Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda 2025-12-04T13:38:32.2387391Z 2025-12-04T13:38:32.2387486Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2387488Z 2025-12-04T13:38:32.2387490Z 2025-12-04T13:38:32.2387576Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.2387672Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.2387930Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d437c85c7343d74e.xml - 2025-12-04T13:38:32.2387997Z =========================== short test summary info ============================ 2025-12-04T13:38:32.2388270Z FAILED [9.6210s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_no_shard_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.2388322Z Traceback (most recent call last): 2025-12-04T13:38:32.2388504Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2388548Z getattr(self, test_name)() 2025-12-04T13:38:32.2388737Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2388774Z fn() 2025-12-04T13:38:32.2388942Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2388984Z method(*args, **kwargs) 2025-12-04T13:38:32.2389151Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2389192Z method(*args, **kwargs) 2025-12-04T13:38:32.2389357Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2389398Z with policy(): 2025-12-04T13:38:32.2389565Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2389644Z raise RuntimeError(msg) 2025-12-04T13:38:32.2390027Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 
2025-12-04T13:38:32.2390029Z 2025-12-04T13:38:32.2390128Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2390381Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda 2025-12-04T13:38:32.2390400Z 2025-12-04T13:38:32.2390494Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2390561Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.2390628Z ======================= 1 failed, 24 deselected in 9.78s ======================= 2025-12-04T13:38:32.2390667Z Got exit code 1 2025-12-04T13:38:32.2390710Z Retrying single test... 2025-12-04T13:38:32.2390916Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-b5832596ab7fbeec.xml 2025-12-04T13:38:32.2390979Z ============================= test session starts ============================== 2025-12-04T13:38:32.2391103Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.2391147Z cachedir: .pytest_cache 2025-12-04T13:38:32.2391319Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.2391383Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.2391428Z configfile: pytest.ini 2025-12-04T13:38:32.2391604Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.2391684Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.2391933Z stepcurrent: skipping 24 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_no_shard_cuda 2025-12-04T13:38:32.2391979Z Running 1 items in this shard 2025-12-04T13:38:32.2391982Z 2025-12-04T13:38:32.2392316Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_no_shard_cuda I1204 13:33:29.524000 426057 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 426126 2025-12-04T13:38:32.2392485Z I1204 13:33:29.525000 426057 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 426127 2025-12-04T13:38:32.2392648Z I1204 13:33:29.526000 426057 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 426128 2025-12-04T13:38:32.2392810Z I1204 13:33:29.526000 426057 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 426129 2025-12-04T13:38:32.2393138Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2393196Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.2393822Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2393863Z _warn_cpu_init() 2025-12-04T13:38:32.2394177Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2394230Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.2394861Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2394915Z _warn_cpu_init() 2025-12-04T13:38:32.2395224Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2395310Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.2395619Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2395703Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.2396018Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.2396079Z return func(*args, **kwargs) 2025-12-04T13:38:32.2396386Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2396440Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.2397060Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2397099Z _warn_cpu_init() 2025-12-04T13:38:32.2397411Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2397462Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.2398091Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2398132Z _warn_cpu_init() 2025-12-04T13:38:32.2398440Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2398523Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.2398830Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2398912Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.2399158Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.2399219Z return func(*args, **kwargs) 2025-12-04T13:38:32.2399458Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.2399519Z return func(*args, **kwargs) 2025-12-04T13:38:32.2399800Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.2399846Z return func(*args, **kwargs) 2025-12-04T13:38:32.2400086Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.2400131Z return func(*args, **kwargs) 2025-12-04T13:38:32.2400367Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.2400412Z return func(*args, **kwargs) 2025-12-04T13:38:32.2400648Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.2400728Z return func(*args, **kwargs) 2025-12-04T13:38:32.2400964Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:38:32.2401007Z return func(*args, **kwargs) 2025-12-04T13:38:32.2401246Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 
2025-12-04T13:38:32.2401288Z return func(*args, **kwargs) 2025-12-04T13:38:32.2401446Z [rank3]:E1204 13:33:37.146000 426129 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2401619Z [rank3]:E1204 13:33:37.146000 426129 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2401935Z [rank3]:E1204 13:33:37.146000 426129 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2402102Z [rank3]:E1204 13:33:37.146000 426129 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2402434Z [rank3]:E1204 13:33:37.146000 426129 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2402571Z [rank3]:E1204 13:33:37.146000 426129 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2402871Z [rank3]:E1204 13:33:37.146000 426129 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2403032Z [rank3]:E1204 13:33:37.146000 426129 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2403331Z [rank3]:E1204 13:33:37.146000 426129 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2403491Z [rank3]:E1204 13:33:37.146000 426129 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2403806Z [rank3]:E1204 13:33:37.146000 426129 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2403953Z [rank3]:E1204 13:33:37.146000 426129 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2404275Z [rank3]:E1204 13:33:37.146000 426129 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2404436Z [rank3]:E1204 13:33:37.146000 426129 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2404965Z [rank3]:E1204 13:33:37.146000 426129 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 108032 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 
2025-12-04T13:38:32.2405089Z [rank3]:E1204 13:33:37.146000 426129 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2405305Z [rank3]:E1204 13:33:37.146000 426129 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2405705Z [rank3]:E1204 13:33:37.146000 426129 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda 2025-12-04T13:38:32.2405831Z [rank3]:E1204 13:33:37.146000 426129 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2406059Z [rank3]:E1204 13:33:37.146000 426129 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2406235Z [rank3]:E1204 13:33:37.146000 426129 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2406278Z dist init r=3, world=4 2025-12-04T13:38:32.2406426Z [rank2]:E1204 13:33:37.147000 426128 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2406599Z [rank2]:E1204 13:33:37.147000 426128 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2406923Z [rank2]:E1204 13:33:37.147000 426128 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2407090Z [rank2]:E1204 13:33:37.147000 426128 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2407400Z [rank2]:E1204 13:33:37.147000 426128 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2407536Z [rank2]:E1204 13:33:37.147000 426128 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2407835Z [rank2]:E1204 13:33:37.147000 426128 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2407995Z [rank2]:E1204 13:33:37.147000 426128 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2408295Z [rank2]:E1204 13:33:37.147000 426128 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2408463Z [rank2]:E1204 13:33:37.147000 426128 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2408762Z [rank2]:E1204 13:33:37.147000 426128 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2408920Z [rank2]:E1204 13:33:37.147000 426128 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.2409221Z [rank2]:E1204 13:33:37.147000 426128 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2409382Z [rank2]:E1204 13:33:37.147000 426128 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2409947Z [rank2]:E1204 13:33:37.147000 426128 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 99840 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T13:38:32.2410088Z [rank2]:E1204 13:33:37.147000 426128 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2410298Z [rank2]:E1204 13:33:37.147000 426128 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2410688Z [rank2]:E1204 13:33:37.147000 426128 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda 2025-12-04T13:38:32.2410809Z [rank2]:E1204 13:33:37.147000 426128 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2411039Z [rank2]:E1204 13:33:37.147000 426128 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2411218Z [rank2]:E1204 13:33:37.147000 426128 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2411259Z dist init r=2, world=4 2025-12-04T13:38:32.2411506Z [rank0]:E1204 13:33:37.202000 426126 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2411696Z [rank0]:E1204 13:33:37.202000 426126 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2412008Z [rank0]:E1204 13:33:37.202000 426126 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2412174Z [rank0]:E1204 13:33:37.202000 426126 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2412485Z [rank0]:E1204 13:33:37.202000 426126 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2412618Z [rank0]:E1204 13:33:37.202000 426126 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2412919Z [rank0]:E1204 13:33:37.202000 426126 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2413095Z [rank0]:E1204 13:33:37.202000 426126 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:38:32.2413395Z [rank0]:E1204 13:33:37.202000 426126 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2413573Z [rank0]:E1204 13:33:37.202000 426126 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2413870Z [rank0]:E1204 13:33:37.202000 426126 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2414020Z [rank0]:E1204 13:33:37.202000 426126 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2414319Z [rank0]:E1204 13:33:37.202000 426126 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2414482Z [rank0]:E1204 13:33:37.202000 426126 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2415011Z [rank0]:E1204 13:33:37.202000 426126 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 99840 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T13:38:32.2415134Z [rank0]:E1204 13:33:37.202000 426126 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2415345Z [rank0]:E1204 13:33:37.202000 426126 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2415734Z [rank0]:E1204 13:33:37.202000 426126 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda 2025-12-04T13:38:32.2415858Z [rank0]:E1204 13:33:37.202000 426126 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2416085Z [rank0]:E1204 13:33:37.202000 426126 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2416274Z [rank0]:E1204 13:33:37.202000 426126 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2416316Z dist init r=0, world=4 2025-12-04T13:38:32.2416464Z [rank1]:E1204 13:33:37.213000 426127 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2416638Z [rank1]:E1204 13:33:37.213000 426127 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2416948Z [rank1]:E1204 13:33:37.213000 426127 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2417115Z [rank1]:E1204 13:33:37.213000 426127 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:38:32.2417422Z [rank1]:E1204 13:33:37.213000 426127 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2417557Z [rank1]:E1204 13:33:37.213000 426127 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2417865Z [rank1]:E1204 13:33:37.213000 426127 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2418046Z [rank1]:E1204 13:33:37.213000 426127 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2418345Z [rank1]:E1204 13:33:37.213000 426127 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2418503Z [rank1]:E1204 13:33:37.213000 426127 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2418800Z [rank1]:E1204 13:33:37.213000 426127 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2418945Z [rank1]:E1204 13:33:37.213000 426127 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2419246Z [rank1]:E1204 13:33:37.213000 426127 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2419416Z [rank1]:E1204 13:33:37.213000 426127 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2419981Z [rank1]:E1204 13:33:37.213000 426127 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 103936 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 
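[editor's note] The RuntimeError above comes from the CUDA memory-leak checker that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 enables: it compares caching-allocator and driver-level allocations before and after the test body. A rough, simplified illustration of that comparison using public torch.cuda APIs; this is an assumption about the idea, not the harness's actual implementation:

    import torch

    def memory_snapshot(device: int):
        torch.cuda.synchronize(device)
        allocator_bytes = torch.cuda.memory_allocated(device)      # caching-allocator view
        free_bytes, total_bytes = torch.cuda.mem_get_info(device)  # driver-level view
        return allocator_bytes, total_bytes - free_bytes

    before = memory_snapshot(2)
    # ... test body would run here ...
    after = memory_snapshot(2)
    if after[0] > before[0] or after[1] > before[1]:
        raise RuntimeError(f"possible leak on device 2: {before} -> {after}")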
2025-12-04T13:38:32.2420107Z [rank1]:E1204 13:33:37.213000 426127 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2420316Z [rank1]:E1204 13:33:37.213000 426127 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2420709Z [rank1]:E1204 13:33:37.213000 426127 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda 2025-12-04T13:38:32.2420846Z [rank1]:E1204 13:33:37.213000 426127 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2421074Z [rank1]:E1204 13:33:37.213000 426127 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2421250Z [rank1]:E1204 13:33:37.213000 426127 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2421292Z dist init r=1, world=4 2025-12-04T13:38:32.2421654Z [rank0]:[W1204 13:33:37.464071535 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.2421698Z FAILED [9.5199s] [100%] 2025-12-04T13:38:32.2421700Z 2025-12-04T13:38:32.2421761Z =================================== FAILURES =================================== 2025-12-04T13:38:32.2421869Z __ TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda __ 2025-12-04T13:38:32.2421920Z Traceback (most recent call last): 2025-12-04T13:38:32.2422098Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.2422146Z self._join_processes(fn) 2025-12-04T13:38:32.2422346Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.2422419Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.2422614Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.2422661Z raise RuntimeError(error) 2025-12-04T13:38:32.2422746Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.2422795Z Traceback (most recent call last): 2025-12-04T13:38:32.2422971Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2423018Z getattr(self, test_name)() 2025-12-04T13:38:32.2423187Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2423228Z fn() 2025-12-04T13:38:32.2423391Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2423438Z method(*args, **kwargs) 2025-12-04T13:38:32.2423615Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2423659Z method(*args, **kwargs) 2025-12-04T13:38:32.2423821Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2423860Z with policy(): 2025-12-04T13:38:32.2424029Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2424073Z raise RuntimeError(msg) 2025-12-04T13:38:32.2424459Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 99840 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T13:38:32.2424462Z 2025-12-04T13:38:32.2424543Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2424798Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda 2025-12-04T13:38:32.2424800Z 2025-12-04T13:38:32.2424894Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2424896Z 2025-12-04T13:38:32.2424898Z 2025-12-04T13:38:32.2424991Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.2425085Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.2425341Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-b5832596ab7fbeec.xml - 2025-12-04T13:38:32.2425408Z =========================== short test summary info ============================ 2025-12-04T13:38:32.2425679Z FAILED [9.5199s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_no_shard_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.2425730Z Traceback (most recent call last): 2025-12-04T13:38:32.2425906Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2425953Z getattr(self, test_name)() 2025-12-04T13:38:32.2426127Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2426165Z fn() 2025-12-04T13:38:32.2426339Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2426384Z method(*args, **kwargs) 2025-12-04T13:38:32.2426546Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2426602Z method(*args, **kwargs) 2025-12-04T13:38:32.2426763Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2426804Z with policy(): 2025-12-04T13:38:32.2426967Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2427013Z raise RuntimeError(msg) 2025-12-04T13:38:32.2427394Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 99840 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 
2025-12-04T13:38:32.2427400Z 2025-12-04T13:38:32.2427479Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2427732Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda 2025-12-04T13:38:32.2427748Z 2025-12-04T13:38:32.2427841Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2427910Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.2427974Z ======================= 1 failed, 32 deselected in 9.68s ======================= 2025-12-04T13:38:32.2428016Z Got exit code 1 2025-12-04T13:38:32.2428058Z Retrying single test... 2025-12-04T13:38:32.2428261Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-7e47b01e24d8b7e2.xml 2025-12-04T13:38:32.2428323Z ============================= test session starts ============================== 2025-12-04T13:38:32.2428446Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.2428491Z cachedir: .pytest_cache 2025-12-04T13:38:32.2428663Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.2428712Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.2428757Z configfile: pytest.ini 2025-12-04T13:38:32.2428934Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.2429028Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.2429274Z stepcurrent: skipping 24 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_no_shard_cuda 2025-12-04T13:38:32.2429320Z Running 1 items in this shard 2025-12-04T13:38:32.2429323Z 2025-12-04T13:38:32.2429694Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_no_shard_cuda I1204 13:33:41.632000 426459 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 426528 2025-12-04T13:38:32.2429864Z I1204 13:33:41.632000 426459 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 426529 2025-12-04T13:38:32.2430032Z I1204 13:33:41.633000 426459 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 426530 2025-12-04T13:38:32.2430197Z I1204 13:33:41.633000 426459 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 426531 2025-12-04T13:38:32.2430515Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2430585Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.2431207Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2431264Z _warn_cpu_init() 2025-12-04T13:38:32.2431577Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2431632Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.2432245Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2432305Z _warn_cpu_init() 2025-12-04T13:38:32.2432615Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2432698Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.2433008Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2433088Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.2433404Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.2433449Z return func(*args, **kwargs) 2025-12-04T13:38:32.2433770Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2433823Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.2434440Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2434481Z _warn_cpu_init() 2025-12-04T13:38:32.2434791Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2434844Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:38:32.2435492Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2435533Z _warn_cpu_init() 2025-12-04T13:38:32.2435854Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2435936Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.2436245Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:38:32.2436324Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:38:32.2436573Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.2436617Z return func(*args, **kwargs) 2025-12-04T13:38:32.2436858Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.2436914Z return func(*args, **kwargs) 2025-12-04T13:38:32.2437153Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.2437196Z return func(*args, **kwargs) 2025-12-04T13:38:32.2437437Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.2437479Z return func(*args, **kwargs) 2025-12-04T13:38:32.2437718Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.2437761Z return func(*args, **kwargs) 2025-12-04T13:38:32.2438000Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.2438043Z return func(*args, **kwargs) 2025-12-04T13:38:32.2438279Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:38:32.2438333Z return func(*args, **kwargs) 2025-12-04T13:38:32.2438570Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned.
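[editor's note] The FutureWarning repeated above recommends DistributedDataParallel over FSDP's deprecated `NO_SHARD` strategy. A minimal sketch of that substitution, assuming the process group is already initialized and one GPU per rank; the names are placeholders rather than the test suite's own helpers:

    import torch
    from torch.nn.parallel import DistributedDataParallel as DDP

    def replace_no_shard_with_ddp(module: torch.nn.Module, rank: int) -> DDP:
        # NO_SHARD keeps full parameters on every rank, which is also what DDP
        # does, so the wrapper can be swapped without changing replication behavior
        module = module.to(torch.device("cuda", rank))
        return DDP(module, device_ids=[rank])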
2025-12-04T13:38:32.2438614Z return func(*args, **kwargs) 2025-12-04T13:38:32.2438771Z [rank2]:E1204 13:33:49.354000 426530 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2438948Z [rank2]:E1204 13:33:49.354000 426530 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2439261Z [rank2]:E1204 13:33:49.354000 426530 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2439429Z [rank2]:E1204 13:33:49.354000 426530 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2439790Z [rank2]:E1204 13:33:49.354000 426530 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2439941Z [rank2]:E1204 13:33:49.354000 426530 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2440243Z [rank2]:E1204 13:33:49.354000 426530 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2440422Z [rank2]:E1204 13:33:49.354000 426530 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2440721Z [rank2]:E1204 13:33:49.354000 426530 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2440878Z [rank2]:E1204 13:33:49.354000 426530 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2441180Z [rank2]:E1204 13:33:49.354000 426530 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2441328Z [rank2]:E1204 13:33:49.354000 426530 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2441643Z [rank2]:E1204 13:33:49.354000 426530 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2441803Z [rank2]:E1204 13:33:49.354000 426530 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2442331Z [rank2]:E1204 13:33:49.354000 426530 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 
2025-12-04T13:38:32.2442456Z [rank2]:E1204 13:33:49.354000 426530 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2442668Z [rank2]:E1204 13:33:49.354000 426530 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2443074Z [rank2]:E1204 13:33:49.354000 426530 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda 2025-12-04T13:38:32.2443196Z [rank2]:E1204 13:33:49.354000 426530 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2443427Z [rank2]:E1204 13:33:49.354000 426530 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2443603Z [rank2]:E1204 13:33:49.354000 426530 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2443647Z dist init r=2, world=4 2025-12-04T13:38:32.2443797Z [rank3]:E1204 13:33:49.357000 426531 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2443969Z [rank3]:E1204 13:33:49.357000 426531 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2444283Z [rank3]:E1204 13:33:49.357000 426531 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2444459Z [rank3]:E1204 13:33:49.357000 426531 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2444766Z [rank3]:E1204 13:33:49.357000 426531 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2444913Z [rank3]:E1204 13:33:49.357000 426531 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2445212Z [rank3]:E1204 13:33:49.357000 426531 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2445374Z [rank3]:E1204 13:33:49.357000 426531 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2445674Z [rank3]:E1204 13:33:49.357000 426531 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2445832Z [rank3]:E1204 13:33:49.357000 426531 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2446145Z [rank3]:E1204 13:33:49.357000 426531 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2446292Z [rank3]:E1204 13:33:49.357000 426531 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.2446590Z [rank3]:E1204 13:33:49.357000 426531 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2446750Z [rank3]:E1204 13:33:49.357000 426531 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2447263Z [rank3]:E1204 13:33:49.357000 426531 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T13:38:32.2447389Z [rank3]:E1204 13:33:49.357000 426531 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2447613Z [rank3]:E1204 13:33:49.357000 426531 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2448004Z [rank3]:E1204 13:33:49.357000 426531 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda 2025-12-04T13:38:32.2448125Z [rank3]:E1204 13:33:49.357000 426531 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2448353Z [rank3]:E1204 13:33:49.357000 426531 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2448532Z [rank3]:E1204 13:33:49.357000 426531 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2448576Z dist init r=3, world=4 2025-12-04T13:38:32.2448725Z [rank1]:E1204 13:33:49.416000 426529 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2448900Z [rank1]:E1204 13:33:49.416000 426529 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2449221Z [rank1]:E1204 13:33:49.416000 426529 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2449403Z [rank1]:E1204 13:33:49.416000 426529 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2449759Z [rank1]:E1204 13:33:49.416000 426529 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2449893Z [rank1]:E1204 13:33:49.416000 426529 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2450192Z [rank1]:E1204 13:33:49.416000 426529 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2450354Z [rank1]:E1204 13:33:49.416000 426529 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:38:32.2450655Z [rank1]:E1204 13:33:49.416000 426529 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2450831Z [rank1]:E1204 13:33:49.416000 426529 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2451135Z [rank1]:E1204 13:33:49.416000 426529 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2451284Z [rank1]:E1204 13:33:49.416000 426529 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2451582Z [rank1]:E1204 13:33:49.416000 426529 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2451744Z [rank1]:E1204 13:33:49.416000 426529 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2452281Z [rank1]:E1204 13:33:49.416000 426529 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 108032 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 2025-12-04T13:38:32.2452408Z [rank1]:E1204 13:33:49.416000 426529 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2452620Z [rank1]:E1204 13:33:49.416000 426529 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2453013Z [rank1]:E1204 13:33:49.416000 426529 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda 2025-12-04T13:38:32.2453139Z [rank1]:E1204 13:33:49.416000 426529 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2453369Z [rank1]:E1204 13:33:49.416000 426529 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2453549Z [rank1]:E1204 13:33:49.416000 426529 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2453591Z dist init r=1, world=4 2025-12-04T13:38:32.2453754Z [rank0]:E1204 13:33:49.417000 426528 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2453927Z [rank0]:E1204 13:33:49.417000 426528 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2454255Z [rank0]:E1204 13:33:49.417000 426528 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2454421Z [rank0]:E1204 13:33:49.417000 426528 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:38:32.2454731Z [rank0]:E1204 13:33:49.417000 426528 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2454870Z [rank0]:E1204 13:33:49.417000 426528 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2455167Z [rank0]:E1204 13:33:49.417000 426528 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2455343Z [rank0]:E1204 13:33:49.417000 426528 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2455644Z [rank0]:E1204 13:33:49.417000 426528 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2455806Z [rank0]:E1204 13:33:49.417000 426528 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2456103Z [rank0]:E1204 13:33:49.417000 426528 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2456252Z [rank0]:E1204 13:33:49.417000 426528 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2456555Z [rank0]:E1204 13:33:49.417000 426528 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2456714Z [rank0]:E1204 13:33:49.417000 426528 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2457244Z [rank0]:E1204 13:33:49.417000 426528 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 112128 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 
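[editor's note] Each of these runs also ends with the ProcessGroupNCCL warning that destroy_process_group() was not called before program exit, which can leak resources. A hedged sketch of the teardown that warning asks for; the wrapper function name is illustrative:

    import torch.distributed as dist

    def shutdown_distributed():
        if dist.is_available() and dist.is_initialized():
            dist.barrier()                 # let outstanding collectives finish
            dist.destroy_process_group()   # the cleanup the warning asks for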
2025-12-04T13:38:32.2457366Z [rank0]:E1204 13:33:49.417000 426528 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2457579Z [rank0]:E1204 13:33:49.417000 426528 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2457968Z [rank0]:E1204 13:33:49.417000 426528 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda 2025-12-04T13:38:32.2458094Z [rank0]:E1204 13:33:49.417000 426528 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2458335Z [rank0]:E1204 13:33:49.417000 426528 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2458511Z [rank0]:E1204 13:33:49.417000 426528 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2458568Z dist init r=0, world=4 2025-12-04T13:38:32.2458932Z [rank0]:[W1204 13:33:49.691287575 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.2458978Z FAILED [9.6183s] [100%] 2025-12-04T13:38:32.2458981Z 2025-12-04T13:38:32.2459042Z =================================== FAILURES =================================== 2025-12-04T13:38:32.2459154Z __ TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda __ 2025-12-04T13:38:32.2459205Z Traceback (most recent call last): 2025-12-04T13:38:32.2459387Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.2459435Z self._join_processes(fn) 2025-12-04T13:38:32.2459683Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.2459756Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.2459955Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.2460006Z raise RuntimeError(error) 2025-12-04T13:38:32.2460091Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.2460145Z Traceback (most recent call last): 2025-12-04T13:38:32.2460319Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2460368Z getattr(self, test_name)() 2025-12-04T13:38:32.2460541Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2460582Z fn() 2025-12-04T13:38:32.2460746Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2460796Z method(*args, **kwargs) 2025-12-04T13:38:32.2460960Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2461008Z method(*args, **kwargs) 2025-12-04T13:38:32.2461170Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2461233Z with policy(): 2025-12-04T13:38:32.2461400Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2461447Z raise RuntimeError(msg) 2025-12-04T13:38:32.2461828Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T13:38:32.2461832Z 2025-12-04T13:38:32.2461918Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2462170Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda 2025-12-04T13:38:32.2462177Z 2025-12-04T13:38:32.2462271Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2462273Z 2025-12-04T13:38:32.2462275Z 2025-12-04T13:38:32.2462361Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.2462455Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.2462728Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-7e47b01e24d8b7e2.xml - 2025-12-04T13:38:32.2462808Z =========================== short test summary info ============================ 2025-12-04T13:38:32.2463086Z FAILED [9.6183s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_no_shard_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.2463136Z Traceback (most recent call last): 2025-12-04T13:38:32.2463319Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2463366Z getattr(self, test_name)() 2025-12-04T13:38:32.2463542Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2463582Z fn() 2025-12-04T13:38:32.2463748Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2463792Z method(*args, **kwargs) 2025-12-04T13:38:32.2463963Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2464018Z method(*args, **kwargs) 2025-12-04T13:38:32.2464184Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2464227Z with policy(): 2025-12-04T13:38:32.2464394Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2464441Z raise RuntimeError(msg) 2025-12-04T13:38:32.2464820Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 95744 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 
2025-12-04T13:38:32.2464823Z 2025-12-04T13:38:32.2464906Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2465159Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_no_shard_cuda 2025-12-04T13:38:32.2465163Z 2025-12-04T13:38:32.2465259Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2465327Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.2465409Z ======================= 1 failed, 32 deselected in 9.78s ======================= 2025-12-04T13:38:32.2465450Z Got exit code 1 2025-12-04T13:38:32.2465650Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_no_shard_cuda 2025-12-04T13:38:32.2465789Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:32.2465995Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-09e9a6f4547b43c6.xml 2025-12-04T13:38:32.2466063Z ============================= test session starts ============================== 2025-12-04T13:38:32.2466186Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.2466234Z cachedir: .pytest_cache 2025-12-04T13:38:32.2466407Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.2466461Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.2466504Z configfile: pytest.ini 2025-12-04T13:38:32.2466683Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.2466774Z collecting ... collected 60 items / 25 deselected / 35 selected 2025-12-04T13:38:32.2466835Z stepcurrent: skipping 25 already run items. 2025-12-04T13:38:32.2466897Z Running 8 items in this shard 2025-12-04T13:38:32.2466899Z 2025-12-04T13:38:32.2467232Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_none_cuda I1204 13:33:53.811000 426861 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 426930 2025-12-04T13:38:32.2467400Z I1204 13:33:53.812000 426861 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 426931 2025-12-04T13:38:32.2467569Z I1204 13:33:53.812000 426861 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 426932 2025-12-04T13:38:32.2467732Z I1204 13:33:53.813000 426861 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 426933 2025-12-04T13:38:32.2468371Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.2468435Z _warn_cpu_init() 2025-12-04T13:38:32.2469052Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2469098Z _warn_cpu_init() 2025-12-04T13:38:32.2469754Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2469801Z _warn_cpu_init() 2025-12-04T13:38:32.2470134Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.2470182Z return func(*args, **kwargs) 2025-12-04T13:38:32.2470801Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.2470844Z _warn_cpu_init() 2025-12-04T13:38:32.2471001Z [rank1]:E1204 13:34:01.556000 426931 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2471177Z [rank1]:E1204 13:34:01.556000 426931 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2471491Z [rank1]:E1204 13:34:01.556000 426931 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2471674Z [rank1]:E1204 13:34:01.556000 426931 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2471982Z [rank1]:E1204 13:34:01.556000 426931 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2472134Z [rank1]:E1204 13:34:01.556000 426931 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2472434Z [rank1]:E1204 13:34:01.556000 426931 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2472598Z [rank1]:E1204 13:34:01.556000 426931 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2472896Z [rank1]:E1204 13:34:01.556000 426931 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2473058Z [rank1]:E1204 13:34:01.556000 426931 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2473376Z [rank1]:E1204 13:34:01.556000 426931 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2473524Z [rank1]:E1204 13:34:01.556000 426931 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2473825Z [rank1]:E1204 13:34:01.556000 426931 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2473986Z [rank1]:E1204 13:34:01.556000 426931 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2474503Z [rank1]:E1204 13:34:01.556000 426931 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 
2025-12-04T13:38:32.2474628Z [rank1]:E1204 13:34:01.556000 426931 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2474854Z [rank1]:E1204 13:34:01.556000 426931 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2475242Z [rank1]:E1204 13:34:01.556000 426931 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda 2025-12-04T13:38:32.2475366Z [rank1]:E1204 13:34:01.556000 426931 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2475598Z [rank1]:E1204 13:34:01.556000 426931 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2475776Z [rank1]:E1204 13:34:01.556000 426931 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2475822Z dist init r=1, world=4 2025-12-04T13:38:32.2475970Z [rank2]:E1204 13:34:01.559000 426932 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2476157Z [rank2]:E1204 13:34:01.559000 426932 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2476465Z [rank2]:E1204 13:34:01.559000 426932 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2476646Z [rank2]:E1204 13:34:01.559000 426932 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2476956Z [rank2]:E1204 13:34:01.559000 426932 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2477090Z [rank2]:E1204 13:34:01.559000 426932 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2477391Z [rank2]:E1204 13:34:01.559000 426932 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2477549Z [rank2]:E1204 13:34:01.559000 426932 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2477861Z [rank2]:E1204 13:34:01.559000 426932 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2478019Z [rank2]:E1204 13:34:01.559000 426932 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2478322Z [rank2]:E1204 13:34:01.559000 426932 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2478474Z [rank2]:E1204 13:34:01.559000 426932 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.2478773Z [rank2]:E1204 13:34:01.559000 426932 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2478938Z [rank2]:E1204 13:34:01.559000 426932 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2479461Z [rank2]:E1204 13:34:01.559000 426932 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 2025-12-04T13:38:32.2479638Z [rank2]:E1204 13:34:01.559000 426932 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2479850Z [rank2]:E1204 13:34:01.559000 426932 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2480238Z [rank2]:E1204 13:34:01.559000 426932 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda 2025-12-04T13:38:32.2480366Z [rank2]:E1204 13:34:01.559000 426932 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2480594Z [rank2]:E1204 13:34:01.559000 426932 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2480776Z [rank2]:E1204 13:34:01.559000 426932 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2480831Z dist init r=2, world=4 2025-12-04T13:38:32.2480984Z [rank3]:E1204 13:34:01.567000 426933 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2481170Z [rank3]:E1204 13:34:01.567000 426933 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2481484Z [rank3]:E1204 13:34:01.567000 426933 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2481651Z [rank3]:E1204 13:34:01.567000 426933 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2481961Z [rank3]:E1204 13:34:01.567000 426933 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2482097Z [rank3]:E1204 13:34:01.567000 426933 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2482396Z [rank3]:E1204 13:34:01.567000 426933 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2482570Z [rank3]:E1204 13:34:01.567000 426933 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:38:32.2482868Z [rank3]:E1204 13:34:01.567000 426933 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2483029Z [rank3]:E1204 13:34:01.567000 426933 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2483329Z [rank3]:E1204 13:34:01.567000 426933 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2483480Z [rank3]:E1204 13:34:01.567000 426933 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2483782Z [rank3]:E1204 13:34:01.567000 426933 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2483956Z [rank3]:E1204 13:34:01.567000 426933 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2484474Z [rank3]:E1204 13:34:01.567000 426933 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 2025-12-04T13:38:32.2484598Z [rank3]:E1204 13:34:01.567000 426933 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2484811Z [rank3]:E1204 13:34:01.567000 426933 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2485198Z [rank3]:E1204 13:34:01.567000 426933 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda 2025-12-04T13:38:32.2485318Z [rank3]:E1204 13:34:01.567000 426933 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2485559Z [rank3]:E1204 13:34:01.567000 426933 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2485737Z [rank3]:E1204 13:34:01.567000 426933 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2485803Z dist init r=3, world=4 2025-12-04T13:38:32.2485952Z [rank0]:E1204 13:34:01.612000 426930 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2486126Z [rank0]:E1204 13:34:01.612000 426930 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2486435Z [rank0]:E1204 13:34:01.612000 426930 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2486605Z [rank0]:E1204 13:34:01.612000 426930 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:38:32.2486913Z [rank0]:E1204 13:34:01.612000 426930 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2487061Z [rank0]:E1204 13:34:01.612000 426930 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2487362Z [rank0]:E1204 13:34:01.612000 426930 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2487520Z [rank0]:E1204 13:34:01.612000 426930 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2487822Z [rank0]:E1204 13:34:01.612000 426930 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2487979Z [rank0]:E1204 13:34:01.612000 426930 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2488283Z [rank0]:E1204 13:34:01.612000 426930 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2488430Z [rank0]:E1204 13:34:01.612000 426930 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2488743Z [rank0]:E1204 13:34:01.612000 426930 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2488907Z [rank0]:E1204 13:34:01.612000 426930 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2489417Z [rank0]:E1204 13:34:01.612000 426930 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 16896 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 
2025-12-04T13:38:32.2489545Z [rank0]:E1204 13:34:01.612000 426930 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2489801Z [rank0]:E1204 13:34:01.612000 426930 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2490205Z [rank0]:E1204 13:34:01.612000 426930 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda 2025-12-04T13:38:32.2490330Z [rank0]:E1204 13:34:01.612000 426930 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2490573Z [rank0]:E1204 13:34:01.612000 426930 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2490752Z [rank0]:E1204 13:34:01.612000 426930 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2490794Z dist init r=0, world=4 2025-12-04T13:38:32.2491161Z [rank0]:[W1204 13:34:01.870736961 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.2491204Z FAILED [9.6201s] [ 12%] 2025-12-04T13:38:32.2491207Z 2025-12-04T13:38:32.2491270Z =================================== FAILURES =================================== 2025-12-04T13:38:32.2491378Z ____ TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda ____ 2025-12-04T13:38:32.2491446Z Traceback (most recent call last): 2025-12-04T13:38:32.2491622Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.2491673Z self._join_processes(fn) 2025-12-04T13:38:32.2491859Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.2491921Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.2492114Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.2492164Z raise RuntimeError(error) 2025-12-04T13:38:32.2492254Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.2492303Z Traceback (most recent call last): 2025-12-04T13:38:32.2492481Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2492528Z getattr(self, test_name)() 2025-12-04T13:38:32.2492704Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2492741Z fn() 2025-12-04T13:38:32.2492908Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2492967Z method(*args, **kwargs) 2025-12-04T13:38:32.2493135Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2493179Z method(*args, **kwargs) 2025-12-04T13:38:32.2493347Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2493387Z with policy(): 2025-12-04T13:38:32.2493557Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2493603Z raise RuntimeError(msg) 2025-12-04T13:38:32.2493982Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 2025-12-04T13:38:32.2493984Z 2025-12-04T13:38:32.2494066Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2494315Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda 2025-12-04T13:38:32.2494328Z 2025-12-04T13:38:32.2494425Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2494427Z 2025-12-04T13:38:32.2494492Z Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.2494556Z Traceback (most recent call last): 2025-12-04T13:38:32.2494732Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2494783Z getattr(self, test_name)() 2025-12-04T13:38:32.2494955Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2494996Z fn() 2025-12-04T13:38:32.2495161Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2495210Z method(*args, **kwargs) 2025-12-04T13:38:32.2495375Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2495421Z method(*args, **kwargs) 2025-12-04T13:38:32.2495585Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2495641Z with policy(): 2025-12-04T13:38:32.2495805Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2495852Z raise RuntimeError(msg) 2025-12-04T13:38:32.2496230Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 
2025-12-04T13:38:32.2496235Z 2025-12-04T13:38:32.2496314Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2496564Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda 2025-12-04T13:38:32.2496566Z 2025-12-04T13:38:32.2496661Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2496664Z 2025-12-04T13:38:32.2496666Z 2025-12-04T13:38:32.2496753Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.2496847Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.2497114Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-09e9a6f4547b43c6.xml - 2025-12-04T13:38:32.2497180Z =========================== short test summary info ============================ 2025-12-04T13:38:32.2497447Z FAILED [9.6201s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_none_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.2497497Z Traceback (most recent call last): 2025-12-04T13:38:32.2497680Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2497727Z getattr(self, test_name)() 2025-12-04T13:38:32.2497905Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2497946Z fn() 2025-12-04T13:38:32.2498112Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2498160Z method(*args, **kwargs) 2025-12-04T13:38:32.2498326Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2498372Z method(*args, **kwargs) 2025-12-04T13:38:32.2498548Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2498592Z with policy(): 2025-12-04T13:38:32.2498757Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2498818Z raise RuntimeError(msg) 2025-12-04T13:38:32.2499196Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 
2025-12-04T13:38:32.2499198Z 2025-12-04T13:38:32.2499281Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2499525Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda 2025-12-04T13:38:32.2499527Z 2025-12-04T13:38:32.2499675Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2499677Z 2025-12-04T13:38:32.2499743Z Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.2499816Z Traceback (most recent call last): 2025-12-04T13:38:32.2499996Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2500041Z getattr(self, test_name)() 2025-12-04T13:38:32.2500220Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2500258Z fn() 2025-12-04T13:38:32.2500427Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2500470Z method(*args, **kwargs) 2025-12-04T13:38:32.2500636Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2500680Z method(*args, **kwargs) 2025-12-04T13:38:32.2500844Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2500886Z with policy(): 2025-12-04T13:38:32.2501055Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2501100Z raise RuntimeError(msg) 2025-12-04T13:38:32.2501490Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 2025-12-04T13:38:32.2501493Z 2025-12-04T13:38:32.2501572Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2501820Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda 2025-12-04T13:38:32.2501823Z 2025-12-04T13:38:32.2501918Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2501988Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.2502059Z ======================= 1 failed, 25 deselected in 9.78s ======================= 2025-12-04T13:38:32.2502099Z Got exit code 1 2025-12-04T13:38:32.2502147Z Retrying single test... 
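[editor's note] The failures above all come from the memory-leak check that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 enables: each rank compares per-device memory counters before and after the test body, and a rank that detects growth exits with code 10, which the parent's _check_return_codes then re-raises as the RuntimeError shown. In the device 1 report, for example, the caching allocator grew from 512 to 25088 bytes (24576 bytes, i.e. 24 KiB) and driver-allocated memory from 2317352960 to 3827302400 bytes (1509949440 bytes, about 1.41 GiB), and the message says the driver-level numbers are what "confirmed" the allocator growth. The following is a minimal, hedged sketch of that before/after comparison only; it is not the actual CudaMemoryLeakCheck in torch/testing/_internal/common_utils.py, which does considerably more (per-device bookkeeping, retries, and the driver-side confirmation step), and check_for_leak/test_fn are illustrative names.

    # Simplified stand-in for the leak check enabled by PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1.
    # Shows only the before/after comparison idea visible in the failure messages above.
    import gc
    import torch

    def check_for_leak(test_fn, device: int = 0, tolerance: int = 0):
        torch.cuda.synchronize(device)
        gc.collect()
        torch.cuda.empty_cache()
        allocator_before = torch.cuda.memory_allocated(device)   # caching-allocator bytes
        driver_free_before, _ = torch.cuda.mem_get_info(device)  # driver-level free memory

        test_fn()  # the test body under measurement

        torch.cuda.synchronize(device)
        gc.collect()
        torch.cuda.empty_cache()
        allocator_after = torch.cuda.memory_allocated(device)
        driver_free_after, _ = torch.cuda.mem_get_info(device)

        allocator_growth = allocator_after - allocator_before
        driver_growth = driver_free_before - driver_free_after
        # Flag a leak only when the driver-level view backs up the allocator growth,
        # mirroring the "CUDA driver API confirmed a leak" wording in the log.
        if allocator_growth > tolerance and driver_growth > tolerance:
            raise RuntimeError(
                f"possible leak on device {device}: caching allocator grew by "
                f"{allocator_growth} bytes, driver-visible usage by {driver_growth} bytes"
            )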
2025-12-04T13:38:32.2502353Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-dfa5b0c3a6ae8df1.xml 2025-12-04T13:38:32.2502419Z ============================= test session starts ============================== 2025-12-04T13:38:32.2502541Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.2502613Z cachedir: .pytest_cache 2025-12-04T13:38:32.2502786Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.2502856Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.2502901Z configfile: pytest.ini 2025-12-04T13:38:32.2503080Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.2503161Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.2503405Z stepcurrent: skipping 25 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_none_cuda 2025-12-04T13:38:32.2503451Z Running 1 items in this shard 2025-12-04T13:38:32.2503454Z 2025-12-04T13:38:32.2503789Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_none_cuda I1204 13:34:05.889000 427263 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 427332 2025-12-04T13:38:32.2503955Z I1204 13:34:05.889000 427263 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 427333 2025-12-04T13:38:32.2504136Z I1204 13:34:05.890000 427263 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 427334 2025-12-04T13:38:32.2504302Z I1204 13:34:05.890000 427263 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 427335 2025-12-04T13:38:32.2504928Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2504972Z _warn_cpu_init() 2025-12-04T13:38:32.2505584Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2505630Z _warn_cpu_init() 2025-12-04T13:38:32.2506258Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2506298Z _warn_cpu_init() 2025-12-04T13:38:32.2506912Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2506955Z _warn_cpu_init() 2025-12-04T13:38:32.2507272Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.2507332Z return func(*args, **kwargs) 2025-12-04T13:38:32.2507486Z [rank0]:E1204 13:34:13.767000 427332 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2507675Z [rank0]:E1204 13:34:13.767000 427332 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2507986Z [rank0]:E1204 13:34:13.767000 427332 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2508155Z [rank0]:E1204 13:34:13.767000 427332 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2508461Z [rank0]:E1204 13:34:13.767000 427332 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2508599Z [rank0]:E1204 13:34:13.767000 427332 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2508899Z [rank0]:E1204 13:34:13.767000 427332 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2509072Z [rank0]:E1204 13:34:13.767000 427332 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2510976Z [rank0]:E1204 13:34:13.767000 427332 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2511137Z [rank0]:E1204 13:34:13.767000 427332 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2511437Z [rank0]:E1204 13:34:13.767000 427332 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2511584Z [rank0]:E1204 13:34:13.767000 427332 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2511885Z [rank0]:E1204 13:34:13.767000 427332 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2512078Z [rank0]:E1204 13:34:13.767000 427332 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2512596Z [rank0]:E1204 13:34:13.767000 427332 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 2025-12-04T13:38:32.2512720Z [rank0]:E1204 13:34:13.767000 427332 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2512935Z [rank0]:E1204 13:34:13.767000 427332 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2513328Z [rank0]:E1204 13:34:13.767000 427332 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda 2025-12-04T13:38:32.2513453Z [rank0]:E1204 13:34:13.767000 427332 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2513697Z [rank0]:E1204 13:34:13.767000 427332 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2513873Z [rank0]:E1204 13:34:13.767000 427332 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2513931Z dist init r=0, world=4 2025-12-04T13:38:32.2514076Z [rank2]:E1204 13:34:13.772000 427334 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2514248Z [rank2]:E1204 13:34:13.772000 427334 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2514558Z [rank2]:E1204 13:34:13.772000 427334 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2514724Z [rank2]:E1204 13:34:13.772000 427334 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2515033Z [rank2]:E1204 13:34:13.772000 427334 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2515168Z [rank2]:E1204 13:34:13.772000 427334 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2515470Z [rank2]:E1204 13:34:13.772000 427334 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2515701Z [rank2]:E1204 13:34:13.772000 427334 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2516000Z [rank2]:E1204 13:34:13.772000 427334 site-packages/torch/testing/_internal/common_distributed.py:935] 
File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2516160Z [rank2]:E1204 13:34:13.772000 427334 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2516460Z [rank2]:E1204 13:34:13.772000 427334 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2516609Z [rank2]:E1204 13:34:13.772000 427334 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2516930Z [rank2]:E1204 13:34:13.772000 427334 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2517090Z [rank2]:E1204 13:34:13.772000 427334 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2517597Z [rank2]:E1204 13:34:13.772000 427334 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 16896 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 2025-12-04T13:38:32.2517720Z [rank2]:E1204 13:34:13.772000 427334 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2517933Z [rank2]:E1204 13:34:13.772000 427334 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2518330Z [rank2]:E1204 13:34:13.772000 427334 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda 2025-12-04T13:38:32.2518454Z [rank2]:E1204 13:34:13.772000 427334 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2518694Z [rank2]:E1204 13:34:13.772000 427334 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2518870Z [rank2]:E1204 13:34:13.772000 427334 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2518912Z dist init r=2, world=4 2025-12-04T13:38:32.2519061Z [rank3]:E1204 13:34:13.787000 427335 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2519231Z [rank3]:E1204 13:34:13.787000 427335 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2519540Z [rank3]:E1204 13:34:13.787000 427335 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2519761Z [rank3]:E1204 13:34:13.787000 427335 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2520070Z [rank3]:E1204 13:34:13.787000 427335 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2520238Z [rank3]:E1204 13:34:13.787000 427335 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2520538Z [rank3]:E1204 13:34:13.787000 427335 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2520697Z [rank3]:E1204 13:34:13.787000 427335 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2520995Z [rank3]:E1204 13:34:13.787000 427335 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2521155Z [rank3]:E1204 13:34:13.787000 427335 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2521467Z [rank3]:E1204 13:34:13.787000 427335 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2521616Z [rank3]:E1204 13:34:13.787000 427335 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2521916Z [rank3]:E1204 13:34:13.787000 427335 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2522075Z [rank3]:E1204 13:34:13.787000 427335 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2522589Z [rank3]:E1204 13:34:13.787000 427335 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 16896 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 
2025-12-04T13:38:32.2522713Z [rank3]:E1204 13:34:13.787000 427335 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2522940Z [rank3]:E1204 13:34:13.787000 427335 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2523327Z [rank3]:E1204 13:34:13.787000 427335 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda 2025-12-04T13:38:32.2523463Z [rank3]:E1204 13:34:13.787000 427335 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2523693Z [rank3]:E1204 13:34:13.787000 427335 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2523869Z [rank3]:E1204 13:34:13.787000 427335 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2523911Z dist init r=3, world=4 2025-12-04T13:38:32.2524057Z [rank1]:E1204 13:34:13.796000 427333 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2524231Z [rank1]:E1204 13:34:13.796000 427333 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2524540Z [rank1]:E1204 13:34:13.796000 427333 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2524708Z [rank1]:E1204 13:34:13.796000 427333 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2525028Z [rank1]:E1204 13:34:13.796000 427333 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2525162Z [rank1]:E1204 13:34:13.796000 427333 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2525460Z [rank1]:E1204 13:34:13.796000 427333 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2525620Z [rank1]:E1204 13:34:13.796000 427333 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2525928Z [rank1]:E1204 13:34:13.796000 427333 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2526087Z [rank1]:E1204 13:34:13.796000 427333 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2526385Z [rank1]:E1204 13:34:13.796000 427333 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2526531Z [rank1]:E1204 13:34:13.796000 427333 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.2526833Z [rank1]:E1204 13:34:13.796000 427333 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2526995Z [rank1]:E1204 13:34:13.796000 427333 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2527521Z [rank1]:E1204 13:34:13.796000 427333 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 16896 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 2025-12-04T13:38:32.2527644Z [rank1]:E1204 13:34:13.796000 427333 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2527865Z [rank1]:E1204 13:34:13.796000 427333 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2528250Z [rank1]:E1204 13:34:13.796000 427333 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda 2025-12-04T13:38:32.2528371Z [rank1]:E1204 13:34:13.796000 427333 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2528601Z [rank1]:E1204 13:34:13.796000 427333 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2528779Z [rank1]:E1204 13:34:13.796000 427333 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2528820Z dist init r=1, world=4 2025-12-04T13:38:32.2529181Z [rank0]:[W1204 13:34:14.033471163 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.2529222Z FAILED [9.9195s] [100%] 2025-12-04T13:38:32.2529240Z 2025-12-04T13:38:32.2529303Z =================================== FAILURES =================================== 2025-12-04T13:38:32.2529410Z ____ TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda ____ 2025-12-04T13:38:32.2529461Z Traceback (most recent call last): 2025-12-04T13:38:32.2529691Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.2529739Z self._join_processes(fn) 2025-12-04T13:38:32.2529926Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.2529988Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.2530182Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.2530229Z raise RuntimeError(error) 2025-12-04T13:38:32.2530316Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.2530381Z Traceback (most recent call last): 2025-12-04T13:38:32.2530555Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2530601Z getattr(self, test_name)() 2025-12-04T13:38:32.2530774Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2530811Z fn() 2025-12-04T13:38:32.2530976Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2531020Z method(*args, **kwargs) 2025-12-04T13:38:32.2531185Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2531228Z method(*args, **kwargs) 2025-12-04T13:38:32.2531393Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2531433Z with policy(): 2025-12-04T13:38:32.2531601Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2531643Z raise RuntimeError(msg) 2025-12-04T13:38:32.2532032Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 
2025-12-04T13:38:32.2532049Z 2025-12-04T13:38:32.2532128Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2532374Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda 2025-12-04T13:38:32.2532378Z 2025-12-04T13:38:32.2532470Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2532474Z 2025-12-04T13:38:32.2532476Z 2025-12-04T13:38:32.2532556Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.2532651Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.2532905Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-dfa5b0c3a6ae8df1.xml - 2025-12-04T13:38:32.2532975Z =========================== short test summary info ============================ 2025-12-04T13:38:32.2533240Z FAILED [9.9195s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.2533291Z Traceback (most recent call last): 2025-12-04T13:38:32.2533486Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2533534Z getattr(self, test_name)() 2025-12-04T13:38:32.2533704Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2533740Z fn() 2025-12-04T13:38:32.2533904Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2533948Z method(*args, **kwargs) 2025-12-04T13:38:32.2534113Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2534158Z method(*args, **kwargs) 2025-12-04T13:38:32.2534320Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2534360Z with policy(): 2025-12-04T13:38:32.2534536Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2534581Z raise RuntimeError(msg) 2025-12-04T13:38:32.2534955Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 2025-12-04T13:38:32.2534958Z 2025-12-04T13:38:32.2535038Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2535286Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda 2025-12-04T13:38:32.2535289Z 2025-12-04T13:38:32.2535382Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2535451Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
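The leak report above is produced by comparing per-device memory counters taken before and after the test body, both at the caching-allocator level and at the driver level. A minimal sketch of that kind of before/after check follows; it is not the actual torch.testing._internal.common_utils implementation, and the helper names and the simple "both counters grew" criterion are assumptions for illustration only.

    import torch

    def memory_snapshot(device: int) -> tuple[int, int]:
        """Return (caching-allocator bytes, driver-allocated bytes) for one device."""
        allocator_bytes = torch.cuda.memory_allocated(device)      # caching allocator view
        free_bytes, total_bytes = torch.cuda.mem_get_info(device)  # driver-level view
        return allocator_bytes, total_bytes - free_bytes

    def run_with_leak_check(device: int, test_fn) -> None:
        # Snapshot, run the test body, synchronize, then snapshot again.
        alloc_before, driver_before = memory_snapshot(device)
        test_fn()
        torch.cuda.synchronize(device)
        alloc_after, driver_after = memory_snapshot(device)
        if alloc_after > alloc_before and driver_after > driver_before:
            raise RuntimeError(
                f"possible leak on device {device}: caching allocator "
                f"{alloc_before} -> {alloc_after}, driver {driver_before} -> {driver_after}"
            )

To reproduce the failure outside CI, the log already prints the exact command, including the PYTORCH_TEST_WITH_ROCM=1 and PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 environment variables that enable ROCm test mode and the leak check; setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 suppresses the repro hint.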
2025-12-04T13:38:32.2535518Z ====================== 1 failed, 32 deselected in 10.08s ======================= 2025-12-04T13:38:32.2535558Z Got exit code 1 2025-12-04T13:38:32.2535601Z Retrying single test... 2025-12-04T13:38:32.2535819Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-ee85649eed6ed9a8.xml 2025-12-04T13:38:32.2535882Z ============================= test session starts ============================== 2025-12-04T13:38:32.2536006Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.2536069Z cachedir: .pytest_cache 2025-12-04T13:38:32.2536238Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.2536288Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.2536332Z configfile: pytest.ini 2025-12-04T13:38:32.2536510Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.2536588Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.2536829Z stepcurrent: skipping 25 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_none_cuda 2025-12-04T13:38:32.2536876Z Running 1 items in this shard 2025-12-04T13:38:32.2536878Z 2025-12-04T13:38:32.2537208Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_none_cuda I1204 13:34:18.472000 427665 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 427734 2025-12-04T13:38:32.2537375Z I1204 13:34:18.473000 427665 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 427735 2025-12-04T13:38:32.2537540Z I1204 13:34:18.473000 427665 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 427736 2025-12-04T13:38:32.2537713Z I1204 13:34:18.474000 427665 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 427737 2025-12-04T13:38:32.2538344Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2538384Z _warn_cpu_init() 2025-12-04T13:38:32.2539006Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.2539050Z _warn_cpu_init() 2025-12-04T13:38:32.2539714Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2539756Z _warn_cpu_init() 2025-12-04T13:38:32.2540073Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.2540118Z return func(*args, **kwargs) 2025-12-04T13:38:32.2540747Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2540800Z _warn_cpu_init() 2025-12-04T13:38:32.2540956Z [rank3]:E1204 13:34:26.322000 427737 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2541129Z [rank3]:E1204 13:34:26.322000 427737 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2541444Z [rank3]:E1204 13:34:26.322000 427737 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2541614Z [rank3]:E1204 13:34:26.322000 427737 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2541918Z [rank3]:E1204 13:34:26.322000 427737 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2542055Z [rank3]:E1204 13:34:26.322000 427737 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2542353Z [rank3]:E1204 13:34:26.322000 427737 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2542529Z [rank3]:E1204 13:34:26.322000 427737 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2542827Z [rank3]:E1204 13:34:26.322000 427737 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2542987Z [rank3]:E1204 13:34:26.322000 427737 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2543290Z [rank3]:E1204 13:34:26.322000 427737 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2543439Z [rank3]:E1204 13:34:26.322000 427737 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2543757Z [rank3]:E1204 13:34:26.322000 427737 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2543916Z [rank3]:E1204 13:34:26.322000 427737 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2544427Z [rank3]:E1204 13:34:26.322000 427737 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 2025-12-04T13:38:32.2544552Z [rank3]:E1204 13:34:26.322000 427737 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2544764Z [rank3]:E1204 13:34:26.322000 427737 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2545158Z [rank3]:E1204 13:34:26.322000 427737 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda 2025-12-04T13:38:32.2545280Z [rank3]:E1204 13:34:26.322000 427737 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2545523Z [rank3]:E1204 13:34:26.322000 427737 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2545699Z [rank3]:E1204 13:34:26.322000 427737 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2545744Z dist init r=3, world=4 2025-12-04T13:38:32.2545892Z [rank2]:E1204 13:34:26.332000 427736 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2546062Z [rank2]:E1204 13:34:26.332000 427736 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2546371Z [rank2]:E1204 13:34:26.332000 427736 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2546539Z [rank2]:E1204 13:34:26.332000 427736 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2546848Z [rank2]:E1204 13:34:26.332000 427736 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2546996Z [rank2]:E1204 13:34:26.332000 427736 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2547292Z [rank2]:E1204 13:34:26.332000 427736 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2547451Z [rank2]:E1204 13:34:26.332000 427736 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2547751Z [rank2]:E1204 13:34:26.332000 427736 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2547908Z [rank2]:E1204 13:34:26.332000 427736 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2548217Z [rank2]:E1204 13:34:26.332000 427736 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2548363Z [rank2]:E1204 13:34:26.332000 427736 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2548663Z [rank2]:E1204 13:34:26.332000 427736 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2548823Z [rank2]:E1204 13:34:26.332000 427736 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2549334Z [rank2]:E1204 13:34:26.332000 427736 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 
2025-12-04T13:38:32.2549459Z [rank2]:E1204 13:34:26.332000 427736 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2549738Z [rank2]:E1204 13:34:26.332000 427736 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2550117Z [rank2]:E1204 13:34:26.332000 427736 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda 2025-12-04T13:38:32.2550255Z [rank2]:E1204 13:34:26.332000 427736 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2550481Z [rank2]:E1204 13:34:26.332000 427736 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2550660Z [rank2]:E1204 13:34:26.332000 427736 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2550700Z dist init r=2, world=4 2025-12-04T13:38:32.2550851Z [rank0]:E1204 13:34:26.376000 427734 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2551022Z [rank0]:E1204 13:34:26.376000 427734 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2551331Z [rank0]:E1204 13:34:26.376000 427734 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2551496Z [rank0]:E1204 13:34:26.376000 427734 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2551820Z [rank0]:E1204 13:34:26.376000 427734 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2551954Z [rank0]:E1204 13:34:26.376000 427734 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2552251Z [rank0]:E1204 13:34:26.376000 427734 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2552412Z [rank0]:E1204 13:34:26.376000 427734 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2552706Z [rank0]:E1204 13:34:26.376000 427734 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2552878Z [rank0]:E1204 13:34:26.376000 427734 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2553179Z [rank0]:E1204 13:34:26.376000 427734 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2553328Z [rank0]:E1204 13:34:26.376000 427734 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.2553631Z [rank0]:E1204 13:34:26.376000 427734 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2553788Z [rank0]:E1204 13:34:26.376000 427734 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2554323Z [rank0]:E1204 13:34:26.376000 427734 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 2025-12-04T13:38:32.2554444Z [rank0]:E1204 13:34:26.376000 427734 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2554667Z [rank0]:E1204 13:34:26.376000 427734 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2555046Z [rank0]:E1204 13:34:26.376000 427734 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda 2025-12-04T13:38:32.2555173Z [rank0]:E1204 13:34:26.376000 427734 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2555401Z [rank0]:E1204 13:34:26.376000 427734 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2555580Z [rank0]:E1204 13:34:26.376000 427734 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2555623Z dist init r=0, world=4 2025-12-04T13:38:32.2555772Z [rank1]:E1204 13:34:26.380000 427735 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2555943Z [rank1]:E1204 13:34:26.380000 427735 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2556255Z [rank1]:E1204 13:34:26.380000 427735 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2556435Z [rank1]:E1204 13:34:26.380000 427735 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2556741Z [rank1]:E1204 13:34:26.380000 427735 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2556875Z [rank1]:E1204 13:34:26.380000 427735 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2557174Z [rank1]:E1204 13:34:26.380000 427735 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2557344Z [rank1]:E1204 13:34:26.380000 427735 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:38:32.2557643Z [rank1]:E1204 13:34:26.380000 427735 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2557799Z [rank1]:E1204 13:34:26.380000 427735 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2558097Z [rank1]:E1204 13:34:26.380000 427735 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2558244Z [rank1]:E1204 13:34:26.380000 427735 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2558543Z [rank1]:E1204 13:34:26.380000 427735 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2558702Z [rank1]:E1204 13:34:26.380000 427735 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2559223Z [rank1]:E1204 13:34:26.380000 427735 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 2025-12-04T13:38:32.2559358Z [rank1]:E1204 13:34:26.380000 427735 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2559614Z [rank1]:E1204 13:34:26.380000 427735 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2559998Z [rank1]:E1204 13:34:26.380000 427735 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda 2025-12-04T13:38:32.2560118Z [rank1]:E1204 13:34:26.380000 427735 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2560347Z [rank1]:E1204 13:34:26.380000 427735 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2560524Z [rank1]:E1204 13:34:26.380000 427735 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2560564Z dist init r=1, world=4 2025-12-04T13:38:32.2560929Z [rank0]:[W1204 13:34:26.684230663 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.2560985Z FAILED [9.7212s] [100%] 2025-12-04T13:38:32.2560987Z 2025-12-04T13:38:32.2561050Z =================================== FAILURES =================================== 2025-12-04T13:38:32.2561156Z ____ TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda ____ 2025-12-04T13:38:32.2561206Z Traceback (most recent call last): 2025-12-04T13:38:32.2561383Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.2561432Z self._join_processes(fn) 2025-12-04T13:38:32.2561619Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.2561679Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.2561886Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.2561935Z raise RuntimeError(error) 2025-12-04T13:38:32.2562019Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.2562068Z Traceback (most recent call last): 2025-12-04T13:38:32.2562243Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2562292Z getattr(self, test_name)() 2025-12-04T13:38:32.2562465Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2562500Z fn() 2025-12-04T13:38:32.2562665Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2562710Z method(*args, **kwargs) 2025-12-04T13:38:32.2562874Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2562917Z method(*args, **kwargs) 2025-12-04T13:38:32.2563081Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2563135Z with policy(): 2025-12-04T13:38:32.2563308Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2563367Z raise RuntimeError(msg) 2025-12-04T13:38:32.2563744Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 
2025-12-04T13:38:32.2563746Z 2025-12-04T13:38:32.2563827Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2564076Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda 2025-12-04T13:38:32.2564079Z 2025-12-04T13:38:32.2564173Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2564179Z 2025-12-04T13:38:32.2564181Z 2025-12-04T13:38:32.2564261Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.2564358Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.2564609Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-ee85649eed6ed9a8.xml - 2025-12-04T13:38:32.2564675Z =========================== short test summary info ============================ 2025-12-04T13:38:32.2564954Z FAILED [9.7212s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_none_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.2565003Z Traceback (most recent call last): 2025-12-04T13:38:32.2565180Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2565228Z getattr(self, test_name)() 2025-12-04T13:38:32.2565398Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2565437Z fn() 2025-12-04T13:38:32.2565601Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2565645Z method(*args, **kwargs) 2025-12-04T13:38:32.2565808Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2565852Z method(*args, **kwargs) 2025-12-04T13:38:32.2566026Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2566068Z with policy(): 2025-12-04T13:38:32.2566232Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2566277Z raise RuntimeError(msg) 2025-12-04T13:38:32.2566654Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 2025-12-04T13:38:32.2566658Z 2025-12-04T13:38:32.2566738Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2566985Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_none_cuda 2025-12-04T13:38:32.2566989Z 2025-12-04T13:38:32.2567081Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2567151Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
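Each run above also ends with the ProcessGroupNCCL warning that destroy_process_group() was not called before program exit. A minimal sketch of the teardown that the warning (and the linked shutdown documentation) asks for is shown below; the env:// style initialization and the placeholder barrier are assumptions for illustration, not the test harness's actual setup code.

    import torch.distributed as dist

    def main() -> None:
        # Rank, world size, and rendezvous address are read from the standard
        # environment variables (RANK, WORLD_SIZE, MASTER_ADDR, MASTER_PORT).
        dist.init_process_group(backend="nccl")
        try:
            dist.barrier()  # placeholder for the real distributed work
        finally:
            # Explicit teardown avoids the "can leak resources" warning at exit.
            dist.destroy_process_group()

    if __name__ == "__main__":
        main()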
2025-12-04T13:38:32.2567226Z ======================= 1 failed, 32 deselected in 9.88s ======================= 2025-12-04T13:38:32.2567267Z Got exit code 1 2025-12-04T13:38:32.2567459Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_none_cuda 2025-12-04T13:38:32.2567611Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:32.2567813Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-e520f518f8fb1ae8.xml 2025-12-04T13:38:32.2567878Z ============================= test session starts ============================== 2025-12-04T13:38:32.2568004Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.2568049Z cachedir: .pytest_cache 2025-12-04T13:38:32.2568222Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.2568271Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.2568315Z configfile: pytest.ini 2025-12-04T13:38:32.2568489Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.2568570Z collecting ... collected 60 items / 26 deselected / 34 selected 2025-12-04T13:38:32.2568627Z stepcurrent: skipping 26 already run items. 2025-12-04T13:38:32.2568673Z Running 7 items in this shard 2025-12-04T13:38:32.2568676Z 2025-12-04T13:38:32.2569019Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda I1204 13:34:30.792000 428067 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 428136 2025-12-04T13:38:32.2569198Z I1204 13:34:30.792000 428067 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 428137 2025-12-04T13:38:32.2569365Z I1204 13:34:30.793000 428067 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 428138 2025-12-04T13:38:32.2569529Z I1204 13:34:30.793000 428067 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 428139 2025-12-04T13:38:32.2570222Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2570265Z _warn_cpu_init() 2025-12-04T13:38:32.2570882Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.2570922Z _warn_cpu_init() 2025-12-04T13:38:32.2571536Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2571577Z _warn_cpu_init() 2025-12-04T13:38:32.2571904Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.2571952Z return func(*args, **kwargs) 2025-12-04T13:38:32.2572575Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2572617Z _warn_cpu_init() 2025-12-04T13:38:32.2572770Z [rank2]:E1204 13:34:38.681000 428138 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2572945Z [rank2]:E1204 13:34:38.681000 428138 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2573260Z [rank2]:E1204 13:34:38.681000 428138 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2573430Z [rank2]:E1204 13:34:38.681000 428138 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2573737Z [rank2]:E1204 13:34:38.681000 428138 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2573888Z [rank2]:E1204 13:34:38.681000 428138 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2574193Z [rank2]:E1204 13:34:38.681000 428138 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2574350Z [rank2]:E1204 13:34:38.681000 428138 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2574650Z [rank2]:E1204 13:34:38.681000 428138 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2574807Z [rank2]:E1204 13:34:38.681000 428138 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2575120Z [rank2]:E1204 13:34:38.681000 428138 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2575269Z [rank2]:E1204 13:34:38.681000 428138 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2575568Z [rank2]:E1204 13:34:38.681000 428138 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2575729Z [rank2]:E1204 13:34:38.681000 428138 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2576256Z [rank2]:E1204 13:34:38.681000 428138 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 16896 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 2025-12-04T13:38:32.2576382Z [rank2]:E1204 13:34:38.681000 428138 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2576603Z [rank2]:E1204 13:34:38.681000 428138 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2577010Z [rank2]:E1204 13:34:38.681000 428138 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2577133Z [rank2]:E1204 13:34:38.681000 428138 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2577360Z [rank2]:E1204 13:34:38.681000 428138 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2577536Z [rank2]:E1204 13:34:38.681000 428138 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2577578Z dist init r=2, world=4 2025-12-04T13:38:32.2577725Z [rank3]:E1204 13:34:38.682000 428139 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2577896Z [rank3]:E1204 13:34:38.682000 428139 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2578206Z [rank3]:E1204 13:34:38.682000 428139 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2578386Z [rank3]:E1204 13:34:38.682000 428139 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2578693Z [rank3]:E1204 13:34:38.682000 428139 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2578827Z [rank3]:E1204 13:34:38.682000 428139 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2579123Z [rank3]:E1204 13:34:38.682000 428139 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2579282Z [rank3]:E1204 13:34:38.682000 428139 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2579637Z [rank3]:E1204 13:34:38.682000 428139 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2579798Z [rank3]:E1204 13:34:38.682000 428139 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2580093Z [rank3]:E1204 13:34:38.682000 428139 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2580239Z [rank3]:E1204 13:34:38.682000 428139 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2580537Z [rank3]:E1204 13:34:38.682000 428139 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2580697Z [rank3]:E1204 13:34:38.682000 428139 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2581241Z [rank3]:E1204 13:34:38.682000 428139 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 
2025-12-04T13:38:32.2581377Z [rank3]:E1204 13:34:38.682000 428139 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2581588Z [rank3]:E1204 13:34:38.682000 428139 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2581980Z [rank3]:E1204 13:34:38.682000 428139 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2582104Z [rank3]:E1204 13:34:38.682000 428139 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2582330Z [rank3]:E1204 13:34:38.682000 428139 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2582505Z [rank3]:E1204 13:34:38.682000 428139 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2582548Z dist init r=3, world=4 2025-12-04T13:38:32.2582694Z [rank1]:E1204 13:34:38.731000 428137 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2582865Z [rank1]:E1204 13:34:38.731000 428137 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2583188Z [rank1]:E1204 13:34:38.731000 428137 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2583355Z [rank1]:E1204 13:34:38.731000 428137 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2583660Z [rank1]:E1204 13:34:38.731000 428137 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2583793Z [rank1]:E1204 13:34:38.731000 428137 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2584103Z [rank1]:E1204 13:34:38.731000 428137 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2584261Z [rank1]:E1204 13:34:38.731000 428137 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2584557Z [rank1]:E1204 13:34:38.731000 428137 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2584714Z [rank1]:E1204 13:34:38.731000 428137 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2585013Z [rank1]:E1204 13:34:38.731000 428137 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2585156Z [rank1]:E1204 13:34:38.731000 428137 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.2585456Z [rank1]:E1204 13:34:38.731000 428137 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2585628Z [rank1]:E1204 13:34:38.731000 428137 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2586153Z [rank1]:E1204 13:34:38.731000 428137 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 16896 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 2025-12-04T13:38:32.2586287Z [rank1]:E1204 13:34:38.731000 428137 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2586498Z [rank1]:E1204 13:34:38.731000 428137 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2586892Z [rank1]:E1204 13:34:38.731000 428137 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2587013Z [rank1]:E1204 13:34:38.731000 428137 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2587238Z [rank1]:E1204 13:34:38.731000 428137 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2587414Z [rank1]:E1204 13:34:38.731000 428137 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2587478Z dist init r=1, world=4 2025-12-04T13:38:32.2587625Z [rank0]:E1204 13:34:38.739000 428136 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2587794Z [rank0]:E1204 13:34:38.739000 428136 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2588104Z [rank0]:E1204 13:34:38.739000 428136 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2588269Z [rank0]:E1204 13:34:38.739000 428136 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2588579Z [rank0]:E1204 13:34:38.739000 428136 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2588724Z [rank0]:E1204 13:34:38.739000 428136 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2589023Z [rank0]:E1204 13:34:38.739000 428136 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2589180Z [rank0]:E1204 13:34:38.739000 428136 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:38:32.2589478Z [rank0]:E1204 13:34:38.739000 428136 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2589685Z [rank0]:E1204 13:34:38.739000 428136 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2589983Z [rank0]:E1204 13:34:38.739000 428136 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2590129Z [rank0]:E1204 13:34:38.739000 428136 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2590439Z [rank0]:E1204 13:34:38.739000 428136 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2590611Z [rank0]:E1204 13:34:38.739000 428136 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2591136Z [rank0]:E1204 13:34:38.739000 428136 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 16896 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 2025-12-04T13:38:32.2591259Z [rank0]:E1204 13:34:38.739000 428136 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2591469Z [rank0]:E1204 13:34:38.739000 428136 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2591860Z [rank0]:E1204 13:34:38.739000 428136 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2591983Z [rank0]:E1204 13:34:38.739000 428136 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2592225Z [rank0]:E1204 13:34:38.739000 428136 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2592400Z [rank0]:E1204 13:34:38.739000 428136 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2592441Z dist init r=0, world=4 2025-12-04T13:38:32.2592800Z [rank0]:[W1204 13:34:39.071627987 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.2592843Z FAILED [9.8194s] [ 14%] 2025-12-04T13:38:32.2592845Z 2025-12-04T13:38:32.2592903Z =================================== FAILURES =================================== 2025-12-04T13:38:32.2593018Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda _ 2025-12-04T13:38:32.2593068Z Traceback (most recent call last): 2025-12-04T13:38:32.2593260Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.2593305Z self._join_processes(fn) 2025-12-04T13:38:32.2593494Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.2593551Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.2593742Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.2593789Z raise RuntimeError(error) 2025-12-04T13:38:32.2593874Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.2593922Z Traceback (most recent call last): 2025-12-04T13:38:32.2594095Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2594141Z getattr(self, test_name)() 2025-12-04T13:38:32.2594313Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2594350Z fn() 2025-12-04T13:38:32.2594522Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2594566Z method(*args, **kwargs) 2025-12-04T13:38:32.2594727Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2594783Z method(*args, **kwargs) 2025-12-04T13:38:32.2594945Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2594984Z with policy(): 2025-12-04T13:38:32.2595147Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2595193Z raise RuntimeError(msg) 2025-12-04T13:38:32.2595581Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 16896 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 
2025-12-04T13:38:32.2595584Z 2025-12-04T13:38:32.2595666Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2595927Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2595932Z 2025-12-04T13:38:32.2596025Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2596028Z 2025-12-04T13:38:32.2596091Z Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.2596152Z Traceback (most recent call last): 2025-12-04T13:38:32.2596329Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2596373Z getattr(self, test_name)() 2025-12-04T13:38:32.2596547Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2596583Z fn() 2025-12-04T13:38:32.2596746Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2596789Z method(*args, **kwargs) 2025-12-04T13:38:32.2596951Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2596992Z method(*args, **kwargs) 2025-12-04T13:38:32.2597155Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2597194Z with policy(): 2025-12-04T13:38:32.2597369Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2597412Z raise RuntimeError(msg) 2025-12-04T13:38:32.2597798Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 2025-12-04T13:38:32.2597801Z 2025-12-04T13:38:32.2597879Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2598135Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2598138Z 2025-12-04T13:38:32.2598231Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2598234Z 2025-12-04T13:38:32.2598237Z 2025-12-04T13:38:32.2598316Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.2598412Z Process 2 terminated with exit code 10, terminating remaining processes. 
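The RuntimeError above comes from the test harness's CUDA memory leak check, which snapshots the caching allocator's allocated bytes on each device before the test and compares them after it finishes (the message also reports driver-allocated memory). A minimal, simplified sketch of that before/after comparison, not the actual CudaMemoryLeakCheck implementation in common_utils.py; the device index and the wrapper name are illustrative:

    import torch

    def check_leak(run_test, device=0):
        # Simplified illustration of the comparison the harness performs.
        torch.cuda.synchronize(device)
        before = torch.cuda.memory_allocated(device)  # caching-allocator bytes before the test
        run_test()
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()  # return unused cached blocks to the driver before re-reading
        after = torch.cuda.memory_allocated(device)
        if after > before:
            raise RuntimeError(
                f"possible leak on device {device}: allocated went from {before} to {after} bytes"
            )

With PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 the harness appears to apply an equivalent check around every test, which is why each failure above reports both the caching-allocator and driver-allocated byte counts per device.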
2025-12-04T13:38:32.2598673Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-e520f518f8fb1ae8.xml - 2025-12-04T13:38:32.2598738Z =========================== short test summary info ============================ 2025-12-04T13:38:32.2599023Z FAILED [9.8194s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.2599072Z Traceback (most recent call last): 2025-12-04T13:38:32.2599248Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2599295Z getattr(self, test_name)() 2025-12-04T13:38:32.2599467Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2599504Z fn() 2025-12-04T13:38:32.2599720Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2599765Z method(*args, **kwargs) 2025-12-04T13:38:32.2599929Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2599972Z method(*args, **kwargs) 2025-12-04T13:38:32.2600134Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2600172Z with policy(): 2025-12-04T13:38:32.2600335Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2600395Z raise RuntimeError(msg) 2025-12-04T13:38:32.2600783Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 16896 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 
2025-12-04T13:38:32.2600785Z 2025-12-04T13:38:32.2600864Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2601123Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2601125Z 2025-12-04T13:38:32.2601216Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2601218Z 2025-12-04T13:38:32.2601282Z Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.2601329Z Traceback (most recent call last): 2025-12-04T13:38:32.2601518Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2601563Z getattr(self, test_name)() 2025-12-04T13:38:32.2601736Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2601772Z fn() 2025-12-04T13:38:32.2601934Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2601978Z method(*args, **kwargs) 2025-12-04T13:38:32.2602139Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2602181Z method(*args, **kwargs) 2025-12-04T13:38:32.2602342Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2602384Z with policy(): 2025-12-04T13:38:32.2602547Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2602591Z raise RuntimeError(msg) 2025-12-04T13:38:32.2602990Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 2025-12-04T13:38:32.2603014Z 2025-12-04T13:38:32.2603094Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2603350Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2603354Z 2025-12-04T13:38:32.2603446Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2603516Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.2603581Z ======================= 1 failed, 26 deselected in 9.96s ======================= 2025-12-04T13:38:32.2603621Z Got exit code 1 2025-12-04T13:38:32.2603663Z Retrying single test... 
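Each failure prints a ready-to-run repro command with the required environment variables. If you want to drive that same invocation from Python (for example from a bisection or flake-hunting script), a sketch assuming you are at the base repo dir; the command and variables are taken verbatim from the repro instructions above:

    import os
    import subprocess

    # Environment variables copied from the repro instructions in the log.
    env = dict(os.environ, PYTORCH_TEST_WITH_ROCM="1", PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1")
    # Optional: silence the repro banner, as the log notes.
    # env["PYTORCH_PRINT_REPRO_ON_FAILURE"] = "0"

    subprocess.run(
        [
            "python",
            "test/distributed/fsdp/test_fsdp_core.py",
            "TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda",
        ],
        env=env,
        check=True,
    )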
2025-12-04T13:38:32.2603865Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-71107c8e73731f92.xml 2025-12-04T13:38:32.2603927Z ============================= test session starts ============================== 2025-12-04T13:38:32.2604048Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.2604091Z cachedir: .pytest_cache 2025-12-04T13:38:32.2604260Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.2604323Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.2604367Z configfile: pytest.ini 2025-12-04T13:38:32.2604542Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.2604621Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.2604871Z stepcurrent: skipping 26 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2604918Z Running 1 items in this shard 2025-12-04T13:38:32.2604921Z 2025-12-04T13:38:32.2605258Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda I1204 13:34:43.303000 428469 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 428538 2025-12-04T13:38:32.2605425Z I1204 13:34:43.304000 428469 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 428539 2025-12-04T13:38:32.2605604Z I1204 13:34:43.304000 428469 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 428540 2025-12-04T13:38:32.2605770Z I1204 13:34:43.305000 428469 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 428541 2025-12-04T13:38:32.2606398Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2606438Z _warn_cpu_init() 2025-12-04T13:38:32.2607063Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2607105Z _warn_cpu_init() 2025-12-04T13:38:32.2607716Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2607769Z _warn_cpu_init() 2025-12-04T13:38:32.2608384Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2608423Z _warn_cpu_init() 2025-12-04T13:38:32.2608741Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.2608788Z return func(*args, **kwargs) 2025-12-04T13:38:32.2608942Z [rank0]:E1204 13:34:51.093000 428538 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2609128Z [rank0]:E1204 13:34:51.093000 428538 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2609441Z [rank0]:E1204 13:34:51.093000 428538 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2609650Z [rank0]:E1204 13:34:51.093000 428538 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2609960Z [rank0]:E1204 13:34:51.093000 428538 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2610093Z [rank0]:E1204 13:34:51.093000 428538 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2610405Z [rank0]:E1204 13:34:51.093000 428538 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2610563Z [rank0]:E1204 13:34:51.093000 428538 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2610863Z [rank0]:E1204 13:34:51.093000 428538 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2611023Z [rank0]:E1204 13:34:51.093000 428538 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2611319Z [rank0]:E1204 13:34:51.093000 428538 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2611469Z [rank0]:E1204 13:34:51.093000 428538 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2611780Z [rank0]:E1204 13:34:51.093000 428538 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2611939Z [rank0]:E1204 13:34:51.093000 428538 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2612477Z [rank0]:E1204 13:34:51.093000 428538 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 2025-12-04T13:38:32.2612601Z [rank0]:E1204 13:34:51.093000 428538 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2612813Z [rank0]:E1204 13:34:51.093000 428538 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2613206Z [rank0]:E1204 13:34:51.093000 428538 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2613330Z [rank0]:E1204 13:34:51.093000 428538 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2613558Z [rank0]:E1204 13:34:51.093000 428538 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2613751Z [rank0]:E1204 13:34:51.093000 428538 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2613791Z dist init r=0, world=4 2025-12-04T13:38:32.2613940Z [rank2]:E1204 13:34:51.095000 428540 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2614114Z [rank2]:E1204 13:34:51.095000 428540 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2614423Z [rank2]:E1204 13:34:51.095000 428540 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2614589Z [rank2]:E1204 13:34:51.095000 428540 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2614907Z [rank2]:E1204 13:34:51.095000 428540 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2615046Z [rank2]:E1204 13:34:51.095000 428540 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2615345Z [rank2]:E1204 13:34:51.095000 428540 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2615504Z [rank2]:E1204 13:34:51.095000 428540 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2615803Z [rank2]:E1204 13:34:51.095000 428540 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2615963Z [rank2]:E1204 13:34:51.095000 428540 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2616260Z [rank2]:E1204 13:34:51.095000 428540 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2616417Z [rank2]:E1204 13:34:51.095000 428540 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2616716Z [rank2]:E1204 13:34:51.095000 428540 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2616886Z [rank2]:E1204 13:34:51.095000 428540 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2617407Z [rank2]:E1204 13:34:51.095000 428540 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 2025-12-04T13:38:32.2617532Z [rank2]:E1204 13:34:51.095000 428540 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2617740Z [rank2]:E1204 13:34:51.095000 428540 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2618138Z [rank2]:E1204 13:34:51.095000 428540 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2618271Z [rank2]:E1204 13:34:51.095000 428540 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2618503Z [rank2]:E1204 13:34:51.095000 428540 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2618683Z [rank2]:E1204 13:34:51.095000 428540 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2618725Z dist init r=2, world=4 2025-12-04T13:38:32.2618870Z [rank3]:E1204 13:34:51.099000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2619043Z [rank3]:E1204 13:34:51.099000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2619354Z [rank3]:E1204 13:34:51.099000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2619530Z [rank3]:E1204 13:34:51.099000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2619882Z 
[rank3]:E1204 13:34:51.099000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2620013Z [rank3]:E1204 13:34:51.099000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2620315Z [rank3]:E1204 13:34:51.099000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2620472Z [rank3]:E1204 13:34:51.099000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2620773Z [rank3]:E1204 13:34:51.099000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2620951Z [rank3]:E1204 13:34:51.099000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2621246Z [rank3]:E1204 13:34:51.099000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2621411Z [rank3]:E1204 13:34:51.099000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2621710Z [rank3]:E1204 13:34:51.099000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2621871Z [rank3]:E1204 13:34:51.099000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2622392Z [rank3]:E1204 13:34:51.099000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 
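The leaked amounts reported above vary per rank and per retry (roughly 12-25 KB of caching-allocator memory), so when chasing this locally it can help to dump what the allocator still holds on each device right after the suspected leak point. A small sketch using standard torch.cuda introspection; nothing here is specific to this test:

    import torch

    # Print per-device caching-allocator state for all visible devices.
    for device in range(torch.cuda.device_count()):
        allocated = torch.cuda.memory_allocated(device)
        reserved = torch.cuda.memory_reserved(device)
        print(f"device {device}: allocated={allocated} bytes, reserved={reserved} bytes")
        print(torch.cuda.memory_summary(device=device, abbreviated=True))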
2025-12-04T13:38:32.2622515Z [rank3]:E1204 13:34:51.099000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2622724Z [rank3]:E1204 13:34:51.099000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2623132Z [rank3]:E1204 13:34:51.099000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2623254Z [rank3]:E1204 13:34:51.099000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2623484Z [rank3]:E1204 13:34:51.099000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2623660Z [rank3]:E1204 13:34:51.099000 428541 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2623700Z dist init r=3, world=4 2025-12-04T13:38:32.2623848Z [rank1]:E1204 13:34:51.180000 428539 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2624032Z [rank1]:E1204 13:34:51.180000 428539 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2624341Z [rank1]:E1204 13:34:51.180000 428539 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2624507Z [rank1]:E1204 13:34:51.180000 428539 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2624812Z [rank1]:E1204 13:34:51.180000 428539 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2624944Z [rank1]:E1204 13:34:51.180000 428539 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2625244Z [rank1]:E1204 13:34:51.180000 428539 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2625404Z [rank1]:E1204 13:34:51.180000 428539 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2625710Z [rank1]:E1204 13:34:51.180000 428539 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2625880Z [rank1]:E1204 13:34:51.180000 428539 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2626180Z [rank1]:E1204 13:34:51.180000 428539 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2626328Z [rank1]:E1204 13:34:51.180000 428539 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.2626629Z [rank1]:E1204 13:34:51.180000 428539 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2626788Z [rank1]:E1204 13:34:51.180000 428539 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2627313Z [rank1]:E1204 13:34:51.180000 428539 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 16896 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 2025-12-04T13:38:32.2627447Z [rank1]:E1204 13:34:51.180000 428539 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2627657Z [rank1]:E1204 13:34:51.180000 428539 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2628051Z [rank1]:E1204 13:34:51.180000 428539 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2628172Z [rank1]:E1204 13:34:51.180000 428539 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2628399Z [rank1]:E1204 13:34:51.180000 428539 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2628588Z [rank1]:E1204 13:34:51.180000 428539 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2628630Z dist init r=1, world=4 2025-12-04T13:38:32.2628991Z [rank0]:[W1204 13:34:51.269833344 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.2629033Z FAILED [9.6209s] [100%] 2025-12-04T13:38:32.2629035Z 2025-12-04T13:38:32.2629093Z =================================== FAILURES =================================== 2025-12-04T13:38:32.2629210Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda _ 2025-12-04T13:38:32.2629259Z Traceback (most recent call last): 2025-12-04T13:38:32.2629433Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.2629481Z self._join_processes(fn) 2025-12-04T13:38:32.2629711Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.2629770Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.2629973Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.2630020Z raise RuntimeError(error) 2025-12-04T13:38:32.2630103Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.2630166Z Traceback (most recent call last): 2025-12-04T13:38:32.2630338Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2630383Z getattr(self, test_name)() 2025-12-04T13:38:32.2630553Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2630591Z fn() 2025-12-04T13:38:32.2630755Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2630798Z method(*args, **kwargs) 2025-12-04T13:38:32.2630962Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2631006Z method(*args, **kwargs) 2025-12-04T13:38:32.2631169Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2631210Z with policy(): 2025-12-04T13:38:32.2631374Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2631419Z raise RuntimeError(msg) 2025-12-04T13:38:32.2634704Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 
2025-12-04T13:38:32.2634737Z 2025-12-04T13:38:32.2634827Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2635098Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2635100Z 2025-12-04T13:38:32.2635197Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2635200Z 2025-12-04T13:38:32.2635202Z 2025-12-04T13:38:32.2635285Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.2635382Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.2635652Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-71107c8e73731f92.xml - 2025-12-04T13:38:32.2635718Z =========================== short test summary info ============================ 2025-12-04T13:38:32.2635999Z FAILED [9.6209s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.2636047Z Traceback (most recent call last): 2025-12-04T13:38:32.2636226Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2636272Z getattr(self, test_name)() 2025-12-04T13:38:32.2636449Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2636486Z fn() 2025-12-04T13:38:32.2636651Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2636695Z method(*args, **kwargs) 2025-12-04T13:38:32.2636858Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2636900Z method(*args, **kwargs) 2025-12-04T13:38:32.2637076Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2637115Z with policy(): 2025-12-04T13:38:32.2637282Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2637337Z raise RuntimeError(msg) 2025-12-04T13:38:32.2637726Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 2025-12-04T13:38:32.2637730Z 2025-12-04T13:38:32.2637811Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2638074Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2638076Z 2025-12-04T13:38:32.2638170Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2638238Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
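The UserWarning from _init_utils.py repeated in this run recommends passing `device_id` so FSDP moves a CPU module to the GPU before running sharding initialization (and notes that `sync_module_states=True` requires GPU communication). A minimal sketch of that pattern; the single-process group setup and the Linear module are placeholders, not the nested model this test actually wraps:

    import os
    import torch
    import torch.distributed as dist
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    # Single-process setup just so the sketch runs standalone; the test spawns 4 ranks.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("nccl", rank=0, world_size=1)
    torch.cuda.set_device(0)

    module = nn.Linear(8, 8)  # placeholder CPU module

    # device_id lets FSDP move the CPU module to the GPU for sharding
    # initialization, as the warning recommends; sync_module_states needs it.
    fsdp_model = FSDP(module, device_id=torch.cuda.current_device(), sync_module_states=True)

    dist.destroy_process_group()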
2025-12-04T13:38:32.2638306Z ======================= 1 failed, 32 deselected in 9.79s ======================= 2025-12-04T13:38:32.2638345Z Got exit code 1 2025-12-04T13:38:32.2638388Z Retrying single test... 2025-12-04T13:38:32.2638593Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-60946723aa831dfe.xml 2025-12-04T13:38:32.2638669Z ============================= test session starts ============================== 2025-12-04T13:38:32.2638793Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.2638837Z cachedir: .pytest_cache 2025-12-04T13:38:32.2639011Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.2639061Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.2639103Z configfile: pytest.ini 2025-12-04T13:38:32.2639282Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.2639361Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.2639659Z stepcurrent: skipping 26 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2639707Z Running 1 items in this shard 2025-12-04T13:38:32.2639709Z 2025-12-04T13:38:32.2640071Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda I1204 13:34:55.621000 428871 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 428940 2025-12-04T13:38:32.2640240Z I1204 13:34:55.622000 428871 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 428941 2025-12-04T13:38:32.2640402Z I1204 13:34:55.622000 428871 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 428942 2025-12-04T13:38:32.2640564Z I1204 13:34:55.623000 428871 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 428943 2025-12-04T13:38:32.2641196Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2641238Z _warn_cpu_init() 2025-12-04T13:38:32.2641872Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2641925Z _warn_cpu_init() 2025-12-04T13:38:32.2642243Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. 
You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.2642288Z return func(*args, **kwargs) 2025-12-04T13:38:32.2642906Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2642945Z _warn_cpu_init() 2025-12-04T13:38:32.2643556Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2643610Z _warn_cpu_init() 2025-12-04T13:38:32.2643766Z [rank3]:E1204 13:35:03.388000 428943 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2643940Z [rank3]:E1204 13:35:03.388000 428943 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2644254Z [rank3]:E1204 13:35:03.388000 428943 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2644420Z [rank3]:E1204 13:35:03.388000 428943 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2644737Z [rank3]:E1204 13:35:03.388000 428943 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2644873Z [rank3]:E1204 13:35:03.388000 428943 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2645177Z [rank3]:E1204 13:35:03.388000 428943 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2645337Z [rank3]:E1204 13:35:03.388000 428943 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2645636Z [rank3]:E1204 13:35:03.388000 428943 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2645795Z [rank3]:E1204 13:35:03.388000 428943 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2646103Z [rank3]:E1204 13:35:03.388000 428943 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2646250Z [rank3]:E1204 13:35:03.388000 428943 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2646568Z [rank3]:E1204 13:35:03.388000 428943 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2646727Z [rank3]:E1204 13:35:03.388000 428943 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2647254Z [rank3]:E1204 13:35:03.388000 428943 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 2025-12-04T13:38:32.2647379Z [rank3]:E1204 13:35:03.388000 428943 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2647589Z [rank3]:E1204 13:35:03.388000 428943 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2647984Z [rank3]:E1204 13:35:03.388000 428943 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2648120Z [rank3]:E1204 13:35:03.388000 428943 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2648347Z [rank3]:E1204 13:35:03.388000 428943 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2648526Z [rank3]:E1204 13:35:03.388000 428943 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2648567Z dist init r=3, world=4 2025-12-04T13:38:32.2648716Z [rank0]:E1204 13:35:03.395000 428940 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2648887Z [rank0]:E1204 13:35:03.395000 428940 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2649211Z [rank0]:E1204 13:35:03.395000 428940 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2649375Z [rank0]:E1204 13:35:03.395000 428940 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2649729Z [rank0]:E1204 13:35:03.395000 428940 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2649862Z [rank0]:E1204 13:35:03.395000 428940 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2650160Z [rank0]:E1204 13:35:03.395000 428940 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2650320Z [rank0]:E1204 
13:35:03.395000 428940 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2650616Z [rank0]:E1204 13:35:03.395000 428940 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2650789Z [rank0]:E1204 13:35:03.395000 428940 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2651083Z [rank0]:E1204 13:35:03.395000 428940 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2651244Z [rank0]:E1204 13:35:03.395000 428940 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2651546Z [rank0]:E1204 13:35:03.395000 428940 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2651707Z [rank0]:E1204 13:35:03.395000 428940 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2652232Z [rank0]:E1204 13:35:03.395000 428940 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 16896 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 2025-12-04T13:38:32.2652354Z [rank0]:E1204 13:35:03.395000 428940 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2652564Z [rank0]:E1204 13:35:03.395000 428940 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2652971Z [rank0]:E1204 13:35:03.395000 428940 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2653094Z [rank0]:E1204 13:35:03.395000 428940 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2653321Z [rank0]:E1204 13:35:03.395000 428940 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2653496Z [rank0]:E1204 13:35:03.395000 428940 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2653537Z dist init r=0, world=4 2025-12-04T13:38:32.2653683Z [rank2]:E1204 13:35:03.446000 428942 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2653868Z [rank2]:E1204 13:35:03.446000 428942 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2654178Z [rank2]:E1204 13:35:03.446000 428942 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T13:38:32.2654342Z [rank2]:E1204 13:35:03.446000 428942 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2654647Z [rank2]:E1204 13:35:03.446000 428942 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2654778Z [rank2]:E1204 13:35:03.446000 428942 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2655077Z [rank2]:E1204 13:35:03.446000 428942 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2655245Z [rank2]:E1204 13:35:03.446000 428942 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2655544Z [rank2]:E1204 13:35:03.446000 428942 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2655711Z [rank2]:E1204 13:35:03.446000 428942 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2656010Z [rank2]:E1204 13:35:03.446000 428942 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2656156Z [rank2]:E1204 13:35:03.446000 428942 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2656458Z [rank2]:E1204 13:35:03.446000 428942 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2656619Z [rank2]:E1204 13:35:03.446000 428942 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2657144Z [rank2]:E1204 13:35:03.446000 428942 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 
2025-12-04T13:38:32.2657285Z [rank2]:E1204 13:35:03.446000 428942 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2657492Z [rank2]:E1204 13:35:03.446000 428942 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2657886Z [rank2]:E1204 13:35:03.446000 428942 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2658008Z [rank2]:E1204 13:35:03.446000 428942 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2658235Z [rank2]:E1204 13:35:03.446000 428942 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2658424Z [rank2]:E1204 13:35:03.446000 428942 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2658464Z dist init r=2, world=4 2025-12-04T13:38:32.2658611Z [rank1]:E1204 13:35:03.448000 428941 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2658783Z [rank1]:E1204 13:35:03.448000 428941 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2659093Z [rank1]:E1204 13:35:03.448000 428941 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2659257Z [rank1]:E1204 13:35:03.448000 428941 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2659564Z [rank1]:E1204 13:35:03.448000 428941 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2659745Z [rank1]:E1204 13:35:03.448000 428941 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2660057Z [rank1]:E1204 13:35:03.448000 428941 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2660229Z [rank1]:E1204 13:35:03.448000 428941 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2660524Z [rank1]:E1204 13:35:03.448000 428941 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2660683Z [rank1]:E1204 13:35:03.448000 428941 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2660978Z [rank1]:E1204 13:35:03.448000 428941 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2661124Z [rank1]:E1204 13:35:03.448000 428941 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.2661422Z [rank1]:E1204 13:35:03.448000 428941 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2661582Z [rank1]:E1204 13:35:03.448000 428941 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2662121Z [rank1]:E1204 13:35:03.448000 428941 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 2025-12-04T13:38:32.2662243Z [rank1]:E1204 13:35:03.448000 428941 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2662452Z [rank1]:E1204 13:35:03.448000 428941 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2662845Z [rank1]:E1204 13:35:03.448000 428941 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2662983Z [rank1]:E1204 13:35:03.448000 428941 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2663210Z [rank1]:E1204 13:35:03.448000 428941 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2663389Z [rank1]:E1204 13:35:03.448000 428941 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2663430Z dist init r=1, world=4 2025-12-04T13:38:32.2663794Z [rank0]:[W1204 13:35:03.621597869 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.2663837Z FAILED [9.6206s] [100%] 2025-12-04T13:38:32.2663839Z 2025-12-04T13:38:32.2663899Z =================================== FAILURES =================================== 2025-12-04T13:38:32.2664017Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda _ 2025-12-04T13:38:32.2664066Z Traceback (most recent call last): 2025-12-04T13:38:32.2664243Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.2664301Z self._join_processes(fn) 2025-12-04T13:38:32.2664487Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.2664558Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.2664750Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.2664797Z raise RuntimeError(error) 2025-12-04T13:38:32.2664883Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.2664932Z Traceback (most recent call last): 2025-12-04T13:38:32.2665107Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2665152Z getattr(self, test_name)() 2025-12-04T13:38:32.2665324Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2665360Z fn() 2025-12-04T13:38:32.2665524Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2665567Z method(*args, **kwargs) 2025-12-04T13:38:32.2665731Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2665774Z method(*args, **kwargs) 2025-12-04T13:38:32.2665935Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2665989Z with policy(): 2025-12-04T13:38:32.2666154Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2666198Z raise RuntimeError(msg) 2025-12-04T13:38:32.2666591Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 
2025-12-04T13:38:32.2666594Z 2025-12-04T13:38:32.2666675Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2666933Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2666935Z 2025-12-04T13:38:32.2667031Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2667034Z 2025-12-04T13:38:32.2667049Z 2025-12-04T13:38:32.2667132Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.2667225Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.2667475Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-60946723aa831dfe.xml - 2025-12-04T13:38:32.2667539Z =========================== short test summary info ============================ 2025-12-04T13:38:32.2667815Z FAILED [9.6206s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.2667863Z Traceback (most recent call last): 2025-12-04T13:38:32.2668040Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2668085Z getattr(self, test_name)() 2025-12-04T13:38:32.2668259Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2668296Z fn() 2025-12-04T13:38:32.2668474Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2668516Z method(*args, **kwargs) 2025-12-04T13:38:32.2668680Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2668735Z method(*args, **kwargs) 2025-12-04T13:38:32.2668900Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2668938Z with policy(): 2025-12-04T13:38:32.2669104Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2669149Z raise RuntimeError(msg) 2025-12-04T13:38:32.2669537Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 2025-12-04T13:38:32.2669539Z 2025-12-04T13:38:32.2669667Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2669926Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2669929Z 2025-12-04T13:38:32.2670021Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2670088Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
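The repro command printed above can also be driven from Python; a hedged sketch using subprocess, assuming the same checkout layout as this job (the path, test id, and environment variables are copied from the failure output, nothing else is known about the local setup):

    import os
    import subprocess

    env = dict(
        os.environ,
        PYTORCH_TEST_WITH_ROCM="1",
        PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1",
        # Setting PYTORCH_PRINT_REPRO_ON_FAILURE="0" instead suppresses the repro banner.
    )
    subprocess.run(
        [
            "python",
            "test/distributed/fsdp/test_fsdp_core.py",
            "TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda",
        ],
        env=env,
        check=True,  # raises CalledProcessError if the leak reproduces and the run exits non-zero
    )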
2025-12-04T13:38:32.2670173Z ======================= 1 failed, 32 deselected in 9.78s ======================= 2025-12-04T13:38:32.2670212Z Got exit code 1 2025-12-04T13:38:32.2670413Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2670551Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:32.2670752Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6033c1bcb9217540.xml 2025-12-04T13:38:32.2670815Z ============================= test session starts ============================== 2025-12-04T13:38:32.2670938Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.2670982Z cachedir: .pytest_cache 2025-12-04T13:38:32.2671153Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.2671218Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.2671261Z configfile: pytest.ini 2025-12-04T13:38:32.2671441Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.2671521Z collecting ... collected 60 items / 27 deselected / 33 selected 2025-12-04T13:38:32.2671578Z stepcurrent: skipping 27 already run items. 2025-12-04T13:38:32.2671622Z Running 6 items in this shard 2025-12-04T13:38:32.2671626Z 2025-12-04T13:38:32.2672018Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda I1204 13:35:07.833000 429273 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 429342 2025-12-04T13:38:32.2672183Z I1204 13:35:07.833000 429273 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 429343 2025-12-04T13:38:32.2672348Z I1204 13:35:07.834000 429273 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 429344 2025-12-04T13:38:32.2672510Z I1204 13:35:07.834000 429273 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 429345 2025-12-04T13:38:32.2673159Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2673221Z _warn_cpu_init() 2025-12-04T13:38:32.2673837Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
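The repeated FSDP UserWarning above recommends passing device_id so that sharding initialization runs on GPU rather than CPU. A minimal sketch of that call, assuming the process group has already been initialized by the harness; the module and the wrap_on_gpu helper are hypothetical stand-ins, not the test's own model:

    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_on_gpu(module: nn.Module, rank: int) -> FSDP:
        # Assumes torch.distributed.init_process_group(...) was already called;
        # `rank` indexes the local GPU for this process.
        return FSDP(
            module,                                 # may start on CPU, as in the warning above
            device_id=torch.device("cuda", rank),   # moves it to GPU before sharding init
            sync_module_states=True,                # needs the module on GPU, hence device_id
        )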
2025-12-04T13:38:32.2673880Z _warn_cpu_init() 2025-12-04T13:38:32.2674498Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2674559Z _warn_cpu_init() 2025-12-04T13:38:32.2675174Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2675212Z _warn_cpu_init() 2025-12-04T13:38:32.2675528Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.2675573Z return func(*args, **kwargs) 2025-12-04T13:38:32.2675726Z [rank0]:E1204 13:35:13.588000 429342 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2675913Z [rank0]:E1204 13:35:13.588000 429342 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2676227Z [rank0]:E1204 13:35:13.588000 429342 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2676395Z [rank0]:E1204 13:35:13.588000 429342 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2676707Z [rank0]:E1204 13:35:13.588000 429342 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2676842Z [rank0]:E1204 13:35:13.588000 429342 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2677141Z [rank0]:E1204 13:35:13.588000 429342 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2677300Z [rank0]:E1204 13:35:13.588000 429342 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2677608Z [rank0]:E1204 13:35:13.588000 429342 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2677780Z [rank0]:E1204 13:35:13.588000 429342 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2678077Z [rank0]:E1204 13:35:13.588000 429342 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2678227Z [rank0]:E1204 13:35:13.588000 429342 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2678528Z [rank0]:E1204 13:35:13.588000 429342 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2678688Z [rank0]:E1204 13:35:13.588000 429342 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2679270Z [rank0]:E1204 13:35:13.588000 429342 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 57856 on device 0. CUDA driver allocated memory was 2453667840 and is now 3594518528. 2025-12-04T13:38:32.2679407Z [rank0]:E1204 13:35:13.588000 429342 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2679696Z [rank0]:E1204 13:35:13.588000 429342 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2680147Z [rank0]:E1204 13:35:13.588000 429342 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2680270Z [rank0]:E1204 13:35:13.588000 429342 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2680498Z [rank0]:E1204 13:35:13.588000 429342 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2680692Z [rank0]:E1204 13:35:13.588000 429342 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2680734Z dist init r=0, world=4 2025-12-04T13:38:32.2680880Z [rank1]:E1204 13:35:13.589000 429343 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2681053Z [rank1]:E1204 13:35:13.589000 429343 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2681366Z [rank1]:E1204 13:35:13.589000 429343 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2681535Z [rank1]:E1204 13:35:13.589000 429343 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2681847Z [rank1]:E1204 13:35:13.589000 429343 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2681980Z [rank1]:E1204 13:35:13.589000 429343 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2682301Z [rank1]:E1204 
13:35:13.589000 429343 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2682474Z [rank1]:E1204 13:35:13.589000 429343 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2682773Z [rank1]:E1204 13:35:13.589000 429343 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2682931Z [rank1]:E1204 13:35:13.589000 429343 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2683229Z [rank1]:E1204 13:35:13.589000 429343 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2683377Z [rank1]:E1204 13:35:13.589000 429343 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2683677Z [rank1]:E1204 13:35:13.589000 429343 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2683836Z [rank1]:E1204 13:35:13.589000 429343 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2684432Z [rank1]:E1204 13:35:13.589000 429343 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 57856 on device 1. CUDA driver allocated memory was 2317352960 and is now 3458203648. 
2025-12-04T13:38:32.2684557Z [rank1]:E1204 13:35:13.589000 429343 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2684766Z [rank1]:E1204 13:35:13.589000 429343 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2685215Z [rank1]:E1204 13:35:13.589000 429343 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2685350Z [rank1]:E1204 13:35:13.589000 429343 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2685576Z [rank1]:E1204 13:35:13.589000 429343 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2685753Z [rank1]:E1204 13:35:13.589000 429343 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2685793Z dist init r=1, world=4 2025-12-04T13:38:32.2685941Z [rank3]:E1204 13:35:13.644000 429345 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2686111Z [rank3]:E1204 13:35:13.644000 429345 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2686420Z [rank3]:E1204 13:35:13.644000 429345 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2686584Z [rank3]:E1204 13:35:13.644000 429345 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2686905Z [rank3]:E1204 13:35:13.644000 429345 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2687051Z [rank3]:E1204 13:35:13.644000 429345 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2687348Z [rank3]:E1204 13:35:13.644000 429345 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2687508Z [rank3]:E1204 13:35:13.644000 429345 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2687805Z [rank3]:E1204 13:35:13.644000 429345 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2687964Z [rank3]:E1204 13:35:13.644000 429345 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2688258Z [rank3]:E1204 13:35:13.644000 429345 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2688404Z [rank3]:E1204 13:35:13.644000 429345 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2688704Z [rank3]:E1204 13:35:13.644000 429345 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2688878Z [rank3]:E1204 13:35:13.644000 429345 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2689455Z [rank3]:E1204 13:35:13.644000 429345 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 57856 on device 3. CUDA driver allocated memory was 2250244096 and is now 3391094784. 2025-12-04T13:38:32.2689626Z [rank3]:E1204 13:35:13.644000 429345 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2689857Z [rank3]:E1204 13:35:13.644000 429345 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2690302Z [rank3]:E1204 13:35:13.644000 429345 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2690424Z [rank3]:E1204 13:35:13.644000 429345 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2690653Z [rank3]:E1204 13:35:13.644000 429345 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2690828Z [rank3]:E1204 13:35:13.644000 429345 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2690871Z dist init r=3, world=4 2025-12-04T13:38:32.2691018Z [rank2]:E1204 13:35:13.645000 429344 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2691189Z [rank2]:E1204 13:35:13.645000 429344 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2691522Z [rank2]:E1204 13:35:13.645000 429344 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2691706Z [rank2]:E1204 13:35:13.645000 429344 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2692013Z [rank2]:E1204 13:35:13.645000 429344 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2692147Z [rank2]:E1204 13:35:13.645000 429344 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2692446Z [rank2]:E1204 13:35:13.645000 429344 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 
3329, in wrapper 2025-12-04T13:38:32.2692604Z [rank2]:E1204 13:35:13.645000 429344 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2692902Z [rank2]:E1204 13:35:13.645000 429344 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2693059Z [rank2]:E1204 13:35:13.645000 429344 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2693357Z [rank2]:E1204 13:35:13.645000 429344 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2693522Z [rank2]:E1204 13:35:13.645000 429344 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2693821Z [rank2]:E1204 13:35:13.645000 429344 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2693981Z [rank2]:E1204 13:35:13.645000 429344 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2694575Z [rank2]:E1204 13:35:13.645000 429344 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 2. CUDA driver allocated memory was 2300575744 and is now 3441426432. 2025-12-04T13:38:32.2694699Z [rank2]:E1204 13:35:13.645000 429344 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2694909Z [rank2]:E1204 13:35:13.645000 429344 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2695352Z [rank2]:E1204 13:35:13.645000 429344 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2695474Z [rank2]:E1204 13:35:13.645000 429344 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2695702Z [rank2]:E1204 13:35:13.645000 429344 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2695878Z [rank2]:E1204 13:35:13.645000 429344 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2695917Z dist init r=2, world=4 2025-12-04T13:38:32.2696293Z [rank0]:[W1204 13:35:13.768870480 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.2696346Z FAILED [7.4165s] [ 16%] 2025-12-04T13:38:32.2696349Z 2025-12-04T13:38:32.2696409Z =================================== FAILURES =================================== 2025-12-04T13:38:32.2696570Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda _ 2025-12-04T13:38:32.2696621Z Traceback (most recent call last): 2025-12-04T13:38:32.2696802Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.2696850Z self._join_processes(fn) 2025-12-04T13:38:32.2697039Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.2697096Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.2697286Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.2697333Z raise RuntimeError(error) 2025-12-04T13:38:32.2697419Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.2697466Z Traceback (most recent call last): 2025-12-04T13:38:32.2697638Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2697698Z getattr(self, test_name)() 2025-12-04T13:38:32.2697869Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2697907Z fn() 2025-12-04T13:38:32.2698071Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2698118Z method(*args, **kwargs) 2025-12-04T13:38:32.2698280Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2698327Z method(*args, **kwargs) 2025-12-04T13:38:32.2698491Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2698533Z with policy(): 2025-12-04T13:38:32.2698697Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2698756Z raise RuntimeError(msg) 2025-12-04T13:38:32.2699191Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 57856 on device 0. CUDA driver allocated memory was 2453667840 and is now 3594518528. 
2025-12-04T13:38:32.2699195Z 2025-12-04T13:38:32.2699276Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2699640Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2699642Z 2025-12-04T13:38:32.2699736Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2699740Z 2025-12-04T13:38:32.2699742Z 2025-12-04T13:38:32.2699825Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.2699920Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.2700187Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6033c1bcb9217540.xml - 2025-12-04T13:38:32.2700252Z =========================== short test summary info ============================ 2025-12-04T13:38:32.2700589Z FAILED [7.4165s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.2700640Z Traceback (most recent call last): 2025-12-04T13:38:32.2700815Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2700862Z getattr(self, test_name)() 2025-12-04T13:38:32.2701036Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2701074Z fn() 2025-12-04T13:38:32.2701237Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2701280Z method(*args, **kwargs) 2025-12-04T13:38:32.2701442Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2701487Z method(*args, **kwargs) 2025-12-04T13:38:32.2701648Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2701689Z with policy(): 2025-12-04T13:38:32.2701854Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2701914Z raise RuntimeError(msg) 2025-12-04T13:38:32.2702348Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 57856 on device 0. CUDA driver allocated memory was 2453667840 and is now 3594518528. 2025-12-04T13:38:32.2702351Z 2025-12-04T13:38:32.2702432Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2702743Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2702746Z 2025-12-04T13:38:32.2702838Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2702908Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.2702991Z ======================= 1 failed, 27 deselected in 7.56s ======================= 2025-12-04T13:38:32.2703032Z Got exit code 1 2025-12-04T13:38:32.2703075Z Retrying single test... 2025-12-04T13:38:32.2703279Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4586b11fe6650479.xml 2025-12-04T13:38:32.2703341Z ============================= test session starts ============================== 2025-12-04T13:38:32.2703464Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.2703508Z cachedir: .pytest_cache 2025-12-04T13:38:32.2703680Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.2703728Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.2703772Z configfile: pytest.ini 2025-12-04T13:38:32.2703949Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.2704029Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.2704344Z stepcurrent: skipping 27 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2704393Z Running 1 items in this shard 2025-12-04T13:38:32.2704395Z 2025-12-04T13:38:32.2704795Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda I1204 13:35:17.956000 429675 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 429744 2025-12-04T13:38:32.2704962Z I1204 13:35:17.957000 429675 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 429745 2025-12-04T13:38:32.2705129Z I1204 13:35:17.957000 429675 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 429746 2025-12-04T13:38:32.2705291Z I1204 13:35:17.958000 429675 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 429747 2025-12-04T13:38:32.2705928Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2705968Z _warn_cpu_init() 2025-12-04T13:38:32.2706587Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.2706642Z _warn_cpu_init() 2025-12-04T13:38:32.2707255Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2707297Z _warn_cpu_init() 2025-12-04T13:38:32.2707931Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2707973Z _warn_cpu_init() 2025-12-04T13:38:32.2708289Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.2708335Z return func(*args, **kwargs) 2025-12-04T13:38:32.2708487Z [rank1]:E1204 13:35:23.684000 429745 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2708661Z [rank1]:E1204 13:35:23.684000 429745 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2708975Z [rank1]:E1204 13:35:23.684000 429745 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2709152Z [rank1]:E1204 13:35:23.684000 429745 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2709463Z [rank1]:E1204 13:35:23.684000 429745 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2709663Z [rank1]:E1204 13:35:23.684000 429745 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2709969Z [rank1]:E1204 13:35:23.684000 429745 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2710130Z [rank1]:E1204 13:35:23.684000 429745 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2710433Z [rank1]:E1204 13:35:23.684000 429745 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2710593Z [rank1]:E1204 13:35:23.684000 429745 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2710889Z [rank1]:E1204 13:35:23.684000 429745 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2711055Z [rank1]:E1204 13:35:23.684000 429745 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2711353Z [rank1]:E1204 13:35:23.684000 429745 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2711515Z [rank1]:E1204 13:35:23.684000 429745 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2712094Z [rank1]:E1204 13:35:23.684000 429745 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3458203648. 2025-12-04T13:38:32.2712220Z [rank1]:E1204 13:35:23.684000 429745 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2712447Z [rank1]:E1204 13:35:23.684000 429745 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2712893Z [rank1]:E1204 13:35:23.684000 429745 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2713017Z [rank1]:E1204 13:35:23.684000 429745 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2713246Z [rank1]:E1204 13:35:23.684000 429745 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2713424Z [rank1]:E1204 13:35:23.684000 429745 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2713466Z dist init r=1, world=4 2025-12-04T13:38:32.2713613Z [rank3]:E1204 13:35:23.687000 429747 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2713802Z [rank3]:E1204 13:35:23.687000 429747 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2714111Z [rank3]:E1204 13:35:23.687000 429747 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2714294Z [rank3]:E1204 13:35:23.687000 429747 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2714602Z [rank3]:E1204 13:35:23.687000 429747 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2714738Z [rank3]:E1204 13:35:23.687000 429747 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2715038Z [rank3]:E1204 
13:35:23.687000 429747 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2715195Z [rank3]:E1204 13:35:23.687000 429747 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2715494Z [rank3]:E1204 13:35:23.687000 429747 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2715664Z [rank3]:E1204 13:35:23.687000 429747 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2715962Z [rank3]:E1204 13:35:23.687000 429747 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2716107Z [rank3]:E1204 13:35:23.687000 429747 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2716408Z [rank3]:E1204 13:35:23.687000 429747 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2716569Z [rank3]:E1204 13:35:23.687000 429747 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2717156Z [rank3]:E1204 13:35:23.687000 429747 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 53760 on device 3. CUDA driver allocated memory was 2250244096 and is now 3391094784. 
2025-12-04T13:38:32.2717282Z [rank3]:E1204 13:35:23.687000 429747 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2717491Z [rank3]:E1204 13:35:23.687000 429747 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2717936Z [rank3]:E1204 13:35:23.687000 429747 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2718059Z [rank3]:E1204 13:35:23.687000 429747 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2718287Z [rank3]:E1204 13:35:23.687000 429747 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2718475Z [rank3]:E1204 13:35:23.687000 429747 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2718516Z dist init r=3, world=4 2025-12-04T13:38:32.2718676Z [rank0]:E1204 13:35:23.703000 429744 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2718847Z [rank0]:E1204 13:35:23.703000 429744 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2719159Z [rank0]:E1204 13:35:23.703000 429744 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2719324Z [rank0]:E1204 13:35:23.703000 429744 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2719658Z [rank0]:E1204 13:35:23.703000 429744 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2719790Z [rank0]:E1204 13:35:23.703000 429744 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2720091Z [rank0]:E1204 13:35:23.703000 429744 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2720250Z [rank0]:E1204 13:35:23.703000 429744 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2720564Z [rank0]:E1204 13:35:23.703000 429744 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2720724Z [rank0]:E1204 13:35:23.703000 429744 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2721020Z [rank0]:E1204 13:35:23.703000 429744 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2721168Z [rank0]:E1204 13:35:23.703000 429744 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2721483Z [rank0]:E1204 13:35:23.703000 429744 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2721643Z [rank0]:E1204 13:35:23.703000 429744 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2722216Z [rank0]:E1204 13:35:23.703000 429744 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 53760 on device 0. CUDA driver allocated memory was 2453667840 and is now 3594518528. 2025-12-04T13:38:32.2722340Z [rank0]:E1204 13:35:23.703000 429744 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2722550Z [rank0]:E1204 13:35:23.703000 429744 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2722992Z [rank0]:E1204 13:35:23.703000 429744 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2723128Z [rank0]:E1204 13:35:23.703000 429744 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2723356Z [rank0]:E1204 13:35:23.703000 429744 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2723546Z [rank0]:E1204 13:35:23.703000 429744 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2723589Z dist init r=0, world=4 2025-12-04T13:38:32.2723736Z [rank2]:E1204 13:35:23.734000 429746 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2723908Z [rank2]:E1204 13:35:23.734000 429746 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2724216Z [rank2]:E1204 13:35:23.734000 429746 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2724382Z [rank2]:E1204 13:35:23.734000 429746 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2724690Z [rank2]:E1204 13:35:23.734000 429746 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2724824Z [rank2]:E1204 13:35:23.734000 429746 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2725140Z [rank2]:E1204 13:35:23.734000 429746 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 
3329, in wrapper 2025-12-04T13:38:32.2725299Z [rank2]:E1204 13:35:23.734000 429746 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2725597Z [rank2]:E1204 13:35:23.734000 429746 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2725755Z [rank2]:E1204 13:35:23.734000 429746 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2726066Z [rank2]:E1204 13:35:23.734000 429746 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2726213Z [rank2]:E1204 13:35:23.734000 429746 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2726512Z [rank2]:E1204 13:35:23.734000 429746 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2726672Z [rank2]:E1204 13:35:23.734000 429746 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2727247Z [rank2]:E1204 13:35:23.734000 429746 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 51712 on device 2. CUDA driver allocated memory was 2300575744 and is now 3441426432. 2025-12-04T13:38:32.2727371Z [rank2]:E1204 13:35:23.734000 429746 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2727593Z [rank2]:E1204 13:35:23.734000 429746 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2728035Z [rank2]:E1204 13:35:23.734000 429746 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2728166Z [rank2]:E1204 13:35:23.734000 429746 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2728396Z [rank2]:E1204 13:35:23.734000 429746 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2728574Z [rank2]:E1204 13:35:23.734000 429746 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2728615Z dist init r=2, world=4 2025-12-04T13:38:32.2728977Z [rank0]:[W1204 13:35:23.909509773 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
2025-12-04T13:38:32.2729020Z FAILED [7.4160s] [100%]
2025-12-04T13:38:32.2729022Z 
2025-12-04T13:38:32.2729082Z =================================== FAILURES ===================================
2025-12-04T13:38:32.2729244Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda _
2025-12-04T13:38:32.2729308Z Traceback (most recent call last):
2025-12-04T13:38:32.2729484Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T13:38:32.2729532Z self._join_processes(fn)
2025-12-04T13:38:32.2729771Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T13:38:32.2729831Z self._check_return_codes(fn, elapsed_time)
2025-12-04T13:38:32.2730024Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T13:38:32.2730072Z raise RuntimeError(error)
2025-12-04T13:38:32.2730157Z RuntimeError: Process 0 exited with error code 10 and exception:
2025-12-04T13:38:32.2730207Z Traceback (most recent call last):
2025-12-04T13:38:32.2730380Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T13:38:32.2730428Z getattr(self, test_name)()
2025-12-04T13:38:32.2730616Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T13:38:32.2730652Z fn()
2025-12-04T13:38:32.2730818Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:38:32.2730861Z method(*args, **kwargs)
2025-12-04T13:38:32.2731027Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:38:32.2731070Z method(*args, **kwargs)
2025-12-04T13:38:32.2731234Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T13:38:32.2731273Z with policy():
2025-12-04T13:38:32.2731439Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T13:38:32.2731483Z raise RuntimeError(msg)
2025-12-04T13:38:32.2731937Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 53760 on device 0. CUDA driver allocated memory was 2453667840 and is now 3594518528.
2025-12-04T13:38:32.2731939Z 
2025-12-04T13:38:32.2732020Z To execute this test, run the following from the base repo dir:
2025-12-04T13:38:32.2732347Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda
2025-12-04T13:38:32.2732349Z 
2025-12-04T13:38:32.2732444Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T13:38:32.2732446Z 
2025-12-04T13:38:32.2732510Z Process 1 exited with error code 10 and exception:
2025-12-04T13:38:32.2732560Z Traceback (most recent call last):
2025-12-04T13:38:32.2732733Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T13:38:32.2732779Z getattr(self, test_name)()
2025-12-04T13:38:32.2732951Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T13:38:32.2732989Z fn()
2025-12-04T13:38:32.2733152Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:38:32.2733196Z method(*args, **kwargs)
2025-12-04T13:38:32.2733358Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:38:32.2733401Z method(*args, **kwargs)
2025-12-04T13:38:32.2733561Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T13:38:32.2733619Z with policy():
2025-12-04T13:38:32.2733781Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T13:38:32.2733826Z raise RuntimeError(msg)
2025-12-04T13:38:32.2734257Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3458203648.
2025-12-04T13:38:32.2734262Z 
2025-12-04T13:38:32.2734342Z To execute this test, run the following from the base repo dir:
2025-12-04T13:38:32.2734654Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda
2025-12-04T13:38:32.2734657Z 
2025-12-04T13:38:32.2734760Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T13:38:32.2734762Z 
2025-12-04T13:38:32.2734826Z Process 3 exited with error code 10 and exception:
2025-12-04T13:38:32.2734873Z Traceback (most recent call last):
2025-12-04T13:38:32.2735051Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T13:38:32.2735095Z getattr(self, test_name)()
2025-12-04T13:38:32.2735268Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T13:38:32.2735304Z fn()
2025-12-04T13:38:32.2735467Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:38:32.2735507Z method(*args, **kwargs)
2025-12-04T13:38:32.2735672Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:38:32.2735714Z method(*args, **kwargs)
2025-12-04T13:38:32.2735875Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T13:38:32.2735914Z with policy():
2025-12-04T13:38:32.2736091Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T13:38:32.2736135Z raise RuntimeError(msg)
2025-12-04T13:38:32.2736578Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 53760 on device 3. CUDA driver allocated memory was 2250244096 and is now 3391094784.
2025-12-04T13:38:32.2736580Z 
2025-12-04T13:38:32.2736662Z To execute this test, run the following from the base repo dir:
2025-12-04T13:38:32.2736972Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda
2025-12-04T13:38:32.2736974Z 
2025-12-04T13:38:32.2737069Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T13:38:32.2737072Z 
2025-12-04T13:38:32.2737074Z 
2025-12-04T13:38:32.2737153Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T13:38:32.2737251Z Process 0 terminated with exit code 10, terminating remaining processes.
2025-12-04T13:38:32.2737500Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4586b11fe6650479.xml -
2025-12-04T13:38:32.2737567Z =========================== short test summary info ============================
2025-12-04T13:38:32.2738011Z FAILED [7.4160s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception:
2025-12-04T13:38:32.2738059Z Traceback (most recent call last):
2025-12-04T13:38:32.2738239Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T13:38:32.2738283Z getattr(self, test_name)()
2025-12-04T13:38:32.2738458Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T13:38:32.2738495Z fn()
2025-12-04T13:38:32.2738659Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:38:32.2738701Z method(*args, **kwargs)
2025-12-04T13:38:32.2738867Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:38:32.2741747Z method(*args, **kwargs)
2025-12-04T13:38:32.2741920Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T13:38:32.2741959Z with policy():
2025-12-04T13:38:32.2742127Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T13:38:32.2742169Z raise RuntimeError(msg)
2025-12-04T13:38:32.2742602Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 53760 on device 0. CUDA driver allocated memory was 2453667840 and is now 3594518528.
2025-12-04T13:38:32.2742606Z 
2025-12-04T13:38:32.2742686Z To execute this test, run the following from the base repo dir:
2025-12-04T13:38:32.2742993Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda
2025-12-04T13:38:32.2742995Z 
2025-12-04T13:38:32.2743088Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T13:38:32.2743090Z 
2025-12-04T13:38:32.2743174Z Process 1 exited with error code 10 and exception:
2025-12-04T13:38:32.2743225Z Traceback (most recent call last):
2025-12-04T13:38:32.2743400Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T13:38:32.2743470Z getattr(self, test_name)()
2025-12-04T13:38:32.2743642Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T13:38:32.2743680Z fn()
2025-12-04T13:38:32.2743842Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:38:32.2743886Z method(*args, **kwargs)
2025-12-04T13:38:32.2744048Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:38:32.2744090Z method(*args, **kwargs)
2025-12-04T13:38:32.2744253Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T13:38:32.2744294Z with policy():
2025-12-04T13:38:32.2744459Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T13:38:32.2744504Z raise RuntimeError(msg)
2025-12-04T13:38:32.2744939Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3458203648.
2025-12-04T13:38:32.2744957Z 
2025-12-04T13:38:32.2745035Z To execute this test, run the following from the base repo dir:
2025-12-04T13:38:32.2745344Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda
2025-12-04T13:38:32.2745346Z 
2025-12-04T13:38:32.2745438Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T13:38:32.2745441Z 
2025-12-04T13:38:32.2745504Z Process 3 exited with error code 10 and exception:
2025-12-04T13:38:32.2745551Z Traceback (most recent call last):
2025-12-04T13:38:32.2745727Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T13:38:32.2745771Z getattr(self, test_name)()
2025-12-04T13:38:32.2745965Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T13:38:32.2746001Z fn()
2025-12-04T13:38:32.2746165Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:38:32.2746207Z method(*args, **kwargs)
2025-12-04T13:38:32.2746371Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:38:32.2746414Z method(*args, **kwargs)
2025-12-04T13:38:32.2746577Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T13:38:32.2746617Z with policy():
2025-12-04T13:38:32.2746779Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T13:38:32.2746824Z raise RuntimeError(msg)
2025-12-04T13:38:32.2747256Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 53760 on device 3. CUDA driver allocated memory was 2250244096 and is now 3391094784.
2025-12-04T13:38:32.2747259Z 
2025-12-04T13:38:32.2747348Z To execute this test, run the following from the base repo dir:
2025-12-04T13:38:32.2747656Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda
2025-12-04T13:38:32.2747669Z 
2025-12-04T13:38:32.2747762Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T13:38:32.2747830Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T13:38:32.2747899Z ======================= 1 failed, 32 deselected in 7.56s =======================
2025-12-04T13:38:32.2747939Z Got exit code 1
2025-12-04T13:38:32.2747983Z Retrying single test...
2025-12-04T13:38:32.2748189Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-b6b47062a3b6547f.xml
2025-12-04T13:38:32.2748255Z ============================= test session starts ==============================
2025-12-04T13:38:32.2748378Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T13:38:32.2748423Z cachedir: .pytest_cache
2025-12-04T13:38:32.2748595Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T13:38:32.2748644Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T13:38:32.2748688Z configfile: pytest.ini
2025-12-04T13:38:32.2748866Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T13:38:32.2748959Z collecting ... collected 60 items / 32 deselected / 28 selected
2025-12-04T13:38:32.2749262Z stepcurrent: skipping 27 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda
2025-12-04T13:38:32.2749311Z Running 1 items in this shard
2025-12-04T13:38:32.2749313Z 
2025-12-04T13:38:32.2749726Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda I1204 13:35:28.011000 430077 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 430146
2025-12-04T13:38:32.2749896Z I1204 13:35:28.011000 430077 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 430147
2025-12-04T13:38:32.2750059Z I1204 13:35:28.012000 430077 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 430148
2025-12-04T13:38:32.2750240Z I1204 13:35:28.012000 430077 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 430149
2025-12-04T13:38:32.2750866Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T13:38:32.2750907Z _warn_cpu_init()
2025-12-04T13:38:32.2751526Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T13:38:32.2751566Z _warn_cpu_init()
2025-12-04T13:38:32.2752200Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU.
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2752255Z _warn_cpu_init() 2025-12-04T13:38:32.2752867Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2752909Z _warn_cpu_init() 2025-12-04T13:38:32.2753224Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.2753272Z return func(*args, **kwargs) 2025-12-04T13:38:32.2753424Z [rank3]:E1204 13:35:33.758000 430149 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2753600Z [rank3]:E1204 13:35:33.758000 430149 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2753926Z [rank3]:E1204 13:35:33.758000 430149 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2754092Z [rank3]:E1204 13:35:33.758000 430149 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2754399Z [rank3]:E1204 13:35:33.758000 430149 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2754614Z [rank3]:E1204 13:35:33.758000 430149 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2754916Z [rank3]:E1204 13:35:33.758000 430149 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2755096Z [rank3]:E1204 13:35:33.758000 430149 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2755405Z [rank3]:E1204 13:35:33.758000 430149 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2755565Z [rank3]:E1204 13:35:33.758000 430149 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2755863Z [rank3]:E1204 13:35:33.758000 430149 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2756012Z [rank3]:E1204 13:35:33.758000 430149 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2756312Z [rank3]:E1204 13:35:33.758000 
430149 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2756483Z [rank3]:E1204 13:35:33.758000 430149 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2757055Z [rank3]:E1204 13:35:33.758000 430149 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3391094784. 2025-12-04T13:38:32.2757192Z [rank3]:E1204 13:35:33.758000 430149 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2757405Z [rank3]:E1204 13:35:33.758000 430149 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2757852Z [rank3]:E1204 13:35:33.758000 430149 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2757975Z [rank3]:E1204 13:35:33.758000 430149 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2758203Z [rank3]:E1204 13:35:33.758000 430149 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2758381Z [rank3]:E1204 13:35:33.758000 430149 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2758433Z dist init r=3, world=4 2025-12-04T13:38:32.2758583Z [rank2]:E1204 13:35:33.762000 430148 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2758754Z [rank2]:E1204 13:35:33.762000 430148 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2759065Z [rank2]:E1204 13:35:33.762000 430148 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2759232Z [rank2]:E1204 13:35:33.762000 430148 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2759539Z [rank2]:E1204 13:35:33.762000 430148 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2759727Z [rank2]:E1204 13:35:33.762000 430148 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2760027Z [rank2]:E1204 13:35:33.762000 430148 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2760186Z [rank2]:E1204 13:35:33.762000 430148 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2760485Z [rank2]:E1204 13:35:33.762000 430148 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2760643Z [rank2]:E1204 13:35:33.762000 430148 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2760943Z [rank2]:E1204 13:35:33.762000 430148 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2761087Z [rank2]:E1204 13:35:33.762000 430148 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2761405Z [rank2]:E1204 13:35:33.762000 430148 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2761578Z [rank2]:E1204 13:35:33.762000 430148 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2762147Z [rank2]:E1204 13:35:33.762000 430148 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3441426432. 2025-12-04T13:38:32.2762272Z [rank2]:E1204 13:35:33.762000 430148 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2762483Z [rank2]:E1204 13:35:33.762000 430148 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2762927Z [rank2]:E1204 13:35:33.762000 430148 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2763048Z [rank2]:E1204 13:35:33.762000 430148 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2763291Z [rank2]:E1204 13:35:33.762000 430148 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2763466Z [rank2]:E1204 13:35:33.762000 430148 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2763508Z dist init r=2, world=4 2025-12-04T13:38:32.2763655Z [rank1]:E1204 13:35:33.793000 430147 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2763828Z [rank1]:E1204 13:35:33.793000 430147 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2764138Z [rank1]:E1204 13:35:33.793000 430147 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2764314Z [rank1]:E1204 13:35:33.793000 430147 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2764624Z [rank1]:E1204 13:35:33.793000 430147 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2764755Z [rank1]:E1204 13:35:33.793000 430147 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2765056Z [rank1]:E1204 13:35:33.793000 430147 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2765214Z [rank1]:E1204 13:35:33.793000 430147 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2765517Z [rank1]:E1204 13:35:33.793000 430147 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2765674Z [rank1]:E1204 13:35:33.793000 430147 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2765982Z [rank1]:E1204 13:35:33.793000 430147 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2766137Z [rank1]:E1204 13:35:33.793000 430147 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2766435Z [rank1]:E1204 13:35:33.793000 430147 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2766596Z [rank1]:E1204 13:35:33.793000 430147 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2767168Z [rank1]:E1204 13:35:33.793000 430147 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 55808 on device 1. CUDA driver allocated memory was 2317352960 and is now 3458203648. 
2025-12-04T13:38:32.2767291Z [rank1]:E1204 13:35:33.793000 430147 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2767502Z [rank1]:E1204 13:35:33.793000 430147 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2767959Z [rank1]:E1204 13:35:33.793000 430147 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2768082Z [rank1]:E1204 13:35:33.793000 430147 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2768307Z [rank1]:E1204 13:35:33.793000 430147 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2768484Z [rank1]:E1204 13:35:33.793000 430147 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2768524Z dist init r=1, world=4 2025-12-04T13:38:32.2768672Z [rank0]:E1204 13:35:33.809000 430146 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2768853Z [rank0]:E1204 13:35:33.809000 430146 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2769162Z [rank0]:E1204 13:35:33.809000 430146 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2769326Z [rank0]:E1204 13:35:33.809000 430146 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2769672Z [rank0]:E1204 13:35:33.809000 430146 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2769804Z [rank0]:E1204 13:35:33.809000 430146 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2770104Z [rank0]:E1204 13:35:33.809000 430146 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2770262Z [rank0]:E1204 13:35:33.809000 430146 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2770575Z [rank0]:E1204 13:35:33.809000 430146 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2770745Z [rank0]:E1204 13:35:33.809000 430146 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2771042Z [rank0]:E1204 13:35:33.809000 430146 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2771189Z [rank0]:E1204 13:35:33.809000 430146 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2771491Z [rank0]:E1204 13:35:33.809000 430146 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2771649Z [rank0]:E1204 13:35:33.809000 430146 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2772218Z [rank0]:E1204 13:35:33.809000 430146 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 51712 on device 0. CUDA driver allocated memory was 2453667840 and is now 3594518528. 2025-12-04T13:38:32.2772355Z [rank0]:E1204 13:35:33.809000 430146 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2772564Z [rank0]:E1204 13:35:33.809000 430146 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2773009Z [rank0]:E1204 13:35:33.809000 430146 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda 2025-12-04T13:38:32.2773131Z [rank0]:E1204 13:35:33.809000 430146 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2773356Z [rank0]:E1204 13:35:33.809000 430146 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2773544Z [rank0]:E1204 13:35:33.809000 430146 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2773586Z dist init r=0, world=4 2025-12-04T13:38:32.2773947Z [rank0]:[W1204 13:35:34.085905987 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
2025-12-04T13:38:32.2773988Z FAILED [7.3165s] [100%]
2025-12-04T13:38:32.2773991Z 
2025-12-04T13:38:32.2774051Z =================================== FAILURES ===================================
2025-12-04T13:38:32.2774213Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda _
2025-12-04T13:38:32.2774262Z Traceback (most recent call last):
2025-12-04T13:38:32.2774439Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T13:38:32.2774485Z self._join_processes(fn)
2025-12-04T13:38:32.2774671Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T13:38:32.2774728Z self._check_return_codes(fn, elapsed_time)
2025-12-04T13:38:32.2774931Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T13:38:32.2774988Z raise RuntimeError(error)
2025-12-04T13:38:32.2775072Z RuntimeError: Process 3 exited with error code 10 and exception:
2025-12-04T13:38:32.2775119Z Traceback (most recent call last):
2025-12-04T13:38:32.2775292Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T13:38:32.2775337Z getattr(self, test_name)()
2025-12-04T13:38:32.2775509Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T13:38:32.2775546Z fn()
2025-12-04T13:38:32.2775707Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:38:32.2775750Z method(*args, **kwargs)
2025-12-04T13:38:32.2775910Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:38:32.2775953Z method(*args, **kwargs)
2025-12-04T13:38:32.2776115Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T13:38:32.2776154Z with policy():
2025-12-04T13:38:32.2776317Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T13:38:32.2776360Z raise RuntimeError(msg)
2025-12-04T13:38:32.2776802Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3391094784.
2025-12-04T13:38:32.2776804Z 
2025-12-04T13:38:32.2776883Z To execute this test, run the following from the base repo dir:
2025-12-04T13:38:32.2777192Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda
2025-12-04T13:38:32.2777196Z 
2025-12-04T13:38:32.2777286Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T13:38:32.2777289Z 
2025-12-04T13:38:32.2777291Z 
2025-12-04T13:38:32.2777371Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T13:38:32.2777482Z Process 3 terminated with exit code 10, terminating remaining processes.
2025-12-04T13:38:32.2777735Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-b6b47062a3b6547f.xml -
2025-12-04T13:38:32.2777799Z =========================== short test summary info ============================
2025-12-04T13:38:32.2778117Z FAILED [7.3165s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda - RuntimeError: Process 3 exited with error code 10 and exception:
2025-12-04T13:38:32.2778167Z Traceback (most recent call last):
2025-12-04T13:38:32.2778341Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T13:38:32.2778386Z getattr(self, test_name)()
2025-12-04T13:38:32.2778556Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T13:38:32.2778594Z fn()
2025-12-04T13:38:32.2778757Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:38:32.2778800Z method(*args, **kwargs)
2025-12-04T13:38:32.2778973Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T13:38:32.2779016Z method(*args, **kwargs)
2025-12-04T13:38:32.2779177Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T13:38:32.2779228Z with policy():
2025-12-04T13:38:32.2779391Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T13:38:32.2779434Z raise RuntimeError(msg)
2025-12-04T13:38:32.2779920Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3391094784.
2025-12-04T13:38:32.2779923Z 
2025-12-04T13:38:32.2780001Z To execute this test, run the following from the base repo dir:
2025-12-04T13:38:32.2780311Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda
2025-12-04T13:38:32.2780314Z 
2025-12-04T13:38:32.2780406Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T13:38:32.2780474Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T13:38:32.2780538Z ======================= 1 failed, 32 deselected in 7.47s =======================
2025-12-04T13:38:32.2780594Z Got exit code 1
2025-12-04T13:38:32.2780843Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda
2025-12-04T13:38:32.2780980Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set
2025-12-04T13:38:32.2781182Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-c0ec8b517b4caa4f.xml
2025-12-04T13:38:32.2781245Z ============================= test session starts ==============================
2025-12-04T13:38:32.2781364Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python
2025-12-04T13:38:32.2781407Z cachedir: .pytest_cache
2025-12-04T13:38:32.2781576Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T13:38:32.2781626Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T13:38:32.2781683Z configfile: pytest.ini
2025-12-04T13:38:32.2781856Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T13:38:32.2781934Z collecting ... collected 60 items / 28 deselected / 32 selected
2025-12-04T13:38:32.2781990Z stepcurrent: skipping 28 already run items.
2025-12-04T13:38:32.2782035Z Running 5 items in this shard
2025-12-04T13:38:32.2782038Z 
2025-12-04T13:38:32.2782411Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda I1204 13:35:37.969000 430479 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 430548
2025-12-04T13:38:32.2782578Z I1204 13:35:37.970000 430479 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 430549
2025-12-04T13:38:32.2782742Z I1204 13:35:37.970000 430479 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 430550
2025-12-04T13:38:32.2782905Z I1204 13:35:37.971000 430479 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 430551
2025-12-04T13:38:32.2783550Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T13:38:32.2783605Z _warn_cpu_init()
2025-12-04T13:38:32.2784217Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T13:38:32.2784256Z _warn_cpu_init() 2025-12-04T13:38:32.2784865Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2784904Z _warn_cpu_init() 2025-12-04T13:38:32.2785219Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.2785278Z return func(*args, **kwargs) 2025-12-04T13:38:32.2785893Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2785932Z _warn_cpu_init() 2025-12-04T13:38:32.2786084Z [rank0]:E1204 13:35:43.954000 430548 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2786256Z [rank0]:E1204 13:35:43.954000 430548 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2786583Z [rank0]:E1204 13:35:43.954000 430548 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2786749Z [rank0]:E1204 13:35:43.954000 430548 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2787053Z [rank0]:E1204 13:35:43.954000 430548 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2787186Z [rank0]:E1204 13:35:43.954000 430548 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2787483Z [rank0]:E1204 13:35:43.954000 430548 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2787641Z [rank0]:E1204 13:35:43.954000 430548 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2787952Z [rank0]:E1204 13:35:43.954000 430548 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2788107Z [rank0]:E1204 13:35:43.954000 430548 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2788415Z [rank0]:E1204 13:35:43.954000 430548 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2788560Z [rank0]:E1204 13:35:43.954000 430548 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2788856Z [rank0]:E1204 13:35:43.954000 430548 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2789014Z [rank0]:E1204 13:35:43.954000 430548 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2789567Z [rank0]:E1204 13:35:43.954000 430548 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 14848 on device 0. CUDA driver allocated memory was 2453667840 and is now 3571449856. 2025-12-04T13:38:32.2789759Z [rank0]:E1204 13:35:43.954000 430548 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2789982Z [rank0]:E1204 13:35:43.954000 430548 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2790407Z [rank0]:E1204 13:35:43.954000 430548 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda 2025-12-04T13:38:32.2790528Z [rank0]:E1204 13:35:43.954000 430548 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2790748Z [rank0]:E1204 13:35:43.954000 430548 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2790920Z [rank0]:E1204 13:35:43.954000 430548 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2790973Z dist init r=0, world=4 2025-12-04T13:38:32.2791117Z [rank2]:E1204 13:35:43.998000 430550 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2791285Z [rank2]:E1204 13:35:43.998000 430550 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2791587Z [rank2]:E1204 13:35:43.998000 430550 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2791747Z [rank2]:E1204 13:35:43.998000 430550 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2792045Z [rank2]:E1204 13:35:43.998000 430550 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2792176Z [rank2]:E1204 13:35:43.998000 430550 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2792479Z [rank2]:E1204 13:35:43.998000 430550 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2792634Z [rank2]:E1204 13:35:43.998000 430550 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2792939Z [rank2]:E1204 13:35:43.998000 430550 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2793094Z [rank2]:E1204 13:35:43.998000 430550 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2793383Z [rank2]:E1204 13:35:43.998000 430550 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2793526Z [rank2]:E1204 13:35:43.998000 430550 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2793819Z [rank2]:E1204 13:35:43.998000 430550 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2793973Z [rank2]:E1204 13:35:43.998000 430550 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2794515Z [rank2]:E1204 13:35:43.998000 430550 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 16896 on device 2. CUDA driver allocated memory was 2300575744 and is now 3418357760. 
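The UserWarning repeated above recommends passing `device_id` to FSDP so sharding initialization runs on GPU (and so `sync_module_states=True` has a GPU-resident module to broadcast). A minimal sketch of that fix; the toy module, rank wiring, and process-group setup below are hypothetical and not taken from the failing test:

```python
# Minimal sketch of the fix the UserWarning above recommends: pass `device_id` so FSDP
# moves the module to the GPU before running sharding initialization. The toy module,
# rank wiring, and process-group setup are hypothetical, not taken from the test.
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

dist.init_process_group("nccl")  # assumes RANK/WORLD_SIZE/MASTER_ADDR env vars are set
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

model = nn.Linear(16, 16)  # still on CPU, which is what triggers the warning
fsdp_model = FSDP(
    model,
    device_id=torch.cuda.current_device(),  # shard init runs on GPU; mutes the warning
    sync_module_states=True,                # needs the module on GPU, per the warning text
)
```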
2025-12-04T13:38:32.2794657Z [rank2]:E1204 13:35:43.998000 430550 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2794863Z [rank2]:E1204 13:35:43.998000 430550 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2795287Z [rank2]:E1204 13:35:43.998000 430550 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda 2025-12-04T13:38:32.2795406Z [rank2]:E1204 13:35:43.998000 430550 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2795641Z [rank2]:E1204 13:35:43.998000 430550 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2795814Z [rank2]:E1204 13:35:43.998000 430550 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2795854Z dist init r=2, world=4 2025-12-04T13:38:32.2795996Z [rank1]:E1204 13:35:44.007000 430549 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2796163Z [rank1]:E1204 13:35:44.007000 430549 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2796464Z [rank1]:E1204 13:35:44.007000 430549 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2796627Z [rank1]:E1204 13:35:44.007000 430549 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2796936Z [rank1]:E1204 13:35:44.007000 430549 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2797065Z [rank1]:E1204 13:35:44.007000 430549 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2797364Z [rank1]:E1204 13:35:44.007000 430549 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2797516Z [rank1]:E1204 13:35:44.007000 430549 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2797809Z [rank1]:E1204 13:35:44.007000 430549 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2797962Z [rank1]:E1204 13:35:44.007000 430549 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2798250Z [rank1]:E1204 13:35:44.007000 430549 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2798391Z [rank1]:E1204 13:35:44.007000 430549 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2798680Z [rank1]:E1204 13:35:44.007000 430549 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2798845Z [rank1]:E1204 13:35:44.007000 430549 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2799381Z [rank1]:E1204 13:35:44.007000 430549 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 14848 on device 1. CUDA driver allocated memory was 2317352960 and is now 3435134976. 2025-12-04T13:38:32.2799500Z [rank1]:E1204 13:35:44.007000 430549 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2799744Z [rank1]:E1204 13:35:44.007000 430549 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2800176Z [rank1]:E1204 13:35:44.007000 430549 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda 2025-12-04T13:38:32.2800298Z [rank1]:E1204 13:35:44.007000 430549 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2800513Z [rank1]:E1204 13:35:44.007000 430549 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2800678Z [rank1]:E1204 13:35:44.007000 430549 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2800715Z dist init r=1, world=4 2025-12-04T13:38:32.2800851Z [rank3]:E1204 13:35:44.011000 430551 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2801010Z [rank3]:E1204 13:35:44.011000 430551 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2801312Z [rank3]:E1204 13:35:44.011000 430551 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2801464Z [rank3]:E1204 13:35:44.011000 430551 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2801759Z [rank3]:E1204 13:35:44.011000 430551 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2801882Z [rank3]:E1204 13:35:44.011000 430551 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2802159Z [rank3]:E1204 13:35:44.011000 430551 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 
2025-12-04T13:38:32.2802306Z [rank3]:E1204 13:35:44.011000 430551 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2802584Z [rank3]:E1204 13:35:44.011000 430551 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2802731Z [rank3]:E1204 13:35:44.011000 430551 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2803006Z [rank3]:E1204 13:35:44.011000 430551 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2803154Z [rank3]:E1204 13:35:44.011000 430551 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2803433Z [rank3]:E1204 13:35:44.011000 430551 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2803579Z [rank3]:E1204 13:35:44.011000 430551 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2804096Z [rank3]:E1204 13:35:44.011000 430551 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 23040 on device 3. CUDA driver allocated memory was 2250244096 and is now 3368026112. 2025-12-04T13:38:32.2804221Z [rank3]:E1204 13:35:44.011000 430551 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2804414Z [rank3]:E1204 13:35:44.011000 430551 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2804814Z [rank3]:E1204 13:35:44.011000 430551 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda 2025-12-04T13:38:32.2804927Z [rank3]:E1204 13:35:44.011000 430551 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2805137Z [rank3]:E1204 13:35:44.011000 430551 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2805301Z [rank3]:E1204 13:35:44.011000 430551 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2805338Z dist init r=3, world=4 2025-12-04T13:38:32.2805680Z [rank0]:[W1204 13:35:44.128536110 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.2805719Z FAILED [7.6170s] [ 20%] 2025-12-04T13:38:32.2805730Z 2025-12-04T13:38:32.2805784Z =================================== FAILURES =================================== 2025-12-04T13:38:32.2805921Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda _ 2025-12-04T13:38:32.2805966Z Traceback (most recent call last): 2025-12-04T13:38:32.2806129Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.2806173Z self._join_processes(fn) 2025-12-04T13:38:32.2806345Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.2806398Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.2806576Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.2806619Z raise RuntimeError(error) 2025-12-04T13:38:32.2806697Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.2806742Z Traceback (most recent call last): 2025-12-04T13:38:32.2806902Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2806943Z getattr(self, test_name)() 2025-12-04T13:38:32.2807118Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2807152Z fn() 2025-12-04T13:38:32.2807303Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2807343Z method(*args, **kwargs) 2025-12-04T13:38:32.2807494Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2807533Z method(*args, **kwargs) 2025-12-04T13:38:32.2807683Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2807719Z with policy(): 2025-12-04T13:38:32.2807871Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2807911Z raise RuntimeError(msg) 2025-12-04T13:38:32.2808311Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 14848 on device 0. CUDA driver allocated memory was 2453667840 and is now 3571449856. 
2025-12-04T13:38:32.2808314Z 2025-12-04T13:38:32.2808389Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2808662Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda 2025-12-04T13:38:32.2808665Z 2025-12-04T13:38:32.2808751Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2808753Z 2025-12-04T13:38:32.2808754Z 2025-12-04T13:38:32.2808829Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.2808915Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.2809152Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-c0ec8b517b4caa4f.xml - 2025-12-04T13:38:32.2809210Z =========================== short test summary info ============================ 2025-12-04T13:38:32.2809510Z FAILED [7.6170s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.2809611Z Traceback (most recent call last): 2025-12-04T13:38:32.2809776Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2809817Z getattr(self, test_name)() 2025-12-04T13:38:32.2809978Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2810012Z fn() 2025-12-04T13:38:32.2810164Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2810203Z method(*args, **kwargs) 2025-12-04T13:38:32.2810353Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2810392Z method(*args, **kwargs) 2025-12-04T13:38:32.2810541Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2810578Z with policy(): 2025-12-04T13:38:32.2810730Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2810770Z raise RuntimeError(msg) 2025-12-04T13:38:32.2811163Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 14848 on device 0. CUDA driver allocated memory was 2453667840 and is now 3571449856. 2025-12-04T13:38:32.2811179Z 2025-12-04T13:38:32.2811252Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2811525Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda 2025-12-04T13:38:32.2811528Z 2025-12-04T13:38:32.2811614Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2811676Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.2811737Z ======================= 1 failed, 28 deselected in 7.76s ======================= 2025-12-04T13:38:32.2811774Z Got exit code 1 2025-12-04T13:38:32.2811813Z Retrying single test... 2025-12-04T13:38:32.2812018Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6f06e3aa3004e1f2.xml 2025-12-04T13:38:32.2812074Z ============================= test session starts ============================== 2025-12-04T13:38:32.2812188Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.2812227Z cachedir: .pytest_cache 2025-12-04T13:38:32.2812385Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.2812431Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.2812471Z configfile: pytest.ini 2025-12-04T13:38:32.2812632Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.2812705Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.2812976Z stepcurrent: skipping 28 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda 2025-12-04T13:38:32.2813018Z Running 1 items in this shard 2025-12-04T13:38:32.2813020Z 2025-12-04T13:38:32.2813378Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda I1204 13:35:48.295000 430881 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 430950 2025-12-04T13:38:32.2813545Z I1204 13:35:48.296000 430881 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 430951 2025-12-04T13:38:32.2813696Z I1204 13:35:48.296000 430881 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 430952 2025-12-04T13:38:32.2813845Z I1204 13:35:48.297000 430881 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 430953 2025-12-04T13:38:32.2814427Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2814464Z _warn_cpu_init() 2025-12-04T13:38:32.2815028Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.2815075Z _warn_cpu_init() 2025-12-04T13:38:32.2815637Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2815674Z _warn_cpu_init() 2025-12-04T13:38:32.2816247Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2816284Z _warn_cpu_init() 2025-12-04T13:38:32.2816577Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.2816620Z return func(*args, **kwargs) 2025-12-04T13:38:32.2816761Z [rank1]:E1204 13:35:54.222000 430951 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2816922Z [rank1]:E1204 13:35:54.222000 430951 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2817210Z [rank1]:E1204 13:35:54.222000 430951 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2817365Z [rank1]:E1204 13:35:54.222000 430951 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2817660Z [rank1]:E1204 13:35:54.222000 430951 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2817784Z [rank1]:E1204 13:35:54.222000 430951 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2818072Z [rank1]:E1204 13:35:54.222000 430951 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2818219Z [rank1]:E1204 13:35:54.222000 430951 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2818501Z [rank1]:E1204 13:35:54.222000 430951 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2818648Z [rank1]:E1204 13:35:54.222000 430951 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2818928Z [rank1]:E1204 13:35:54.222000 430951 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2819064Z [rank1]:E1204 13:35:54.222000 430951 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2819344Z [rank1]:E1204 13:35:54.222000 430951 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2819500Z [rank1]:E1204 13:35:54.222000 430951 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2820053Z [rank1]:E1204 13:35:54.222000 430951 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 23040 on device 1. CUDA driver allocated memory was 2317352960 and is now 3435134976. 2025-12-04T13:38:32.2820172Z [rank1]:E1204 13:35:54.222000 430951 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2820369Z [rank1]:E1204 13:35:54.222000 430951 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2820783Z [rank1]:E1204 13:35:54.222000 430951 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda 2025-12-04T13:38:32.2820897Z [rank1]:E1204 13:35:54.222000 430951 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2821108Z [rank1]:E1204 13:35:54.222000 430951 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2821273Z [rank1]:E1204 13:35:54.222000 430951 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2821311Z dist init r=1, world=4 2025-12-04T13:38:32.2821447Z [rank0]:E1204 13:35:54.237000 430950 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2821609Z [rank0]:E1204 13:35:54.237000 430950 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2821907Z [rank0]:E1204 13:35:54.237000 430950 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2822060Z [rank0]:E1204 13:35:54.237000 430950 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2822355Z [rank0]:E1204 13:35:54.237000 430950 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2822477Z [rank0]:E1204 13:35:54.237000 430950 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2822753Z [rank0]:E1204 13:35:54.237000 430950 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2822900Z [rank0]:E1204 13:35:54.237000 430950 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2823177Z [rank0]:E1204 13:35:54.237000 430950 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2823324Z [rank0]:E1204 13:35:54.237000 430950 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2823603Z [rank0]:E1204 13:35:54.237000 430950 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2823751Z [rank0]:E1204 13:35:54.237000 430950 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2824027Z [rank0]:E1204 13:35:54.237000 430950 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2824173Z [rank0]:E1204 13:35:54.237000 430950 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2824688Z [rank0]:E1204 13:35:54.237000 430950 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 18944 on device 0. CUDA driver allocated memory was 2453667840 and is now 3571449856. 
2025-12-04T13:38:32.2824814Z [rank0]:E1204 13:35:54.237000 430950 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2825007Z [rank0]:E1204 13:35:54.237000 430950 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2825407Z [rank0]:E1204 13:35:54.237000 430950 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda 2025-12-04T13:38:32.2825519Z [rank0]:E1204 13:35:54.237000 430950 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2825732Z [rank0]:E1204 13:35:54.237000 430950 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2825897Z [rank0]:E1204 13:35:54.237000 430950 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2825934Z dist init r=0, world=4 2025-12-04T13:38:32.2826088Z [rank2]:E1204 13:35:54.296000 430952 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2826247Z [rank2]:E1204 13:35:54.296000 430952 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2826544Z [rank2]:E1204 13:35:54.296000 430952 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2826697Z [rank2]:E1204 13:35:54.296000 430952 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2826983Z [rank2]:E1204 13:35:54.296000 430952 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2827106Z [rank2]:E1204 13:35:54.296000 430952 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2827385Z [rank2]:E1204 13:35:54.296000 430952 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2827530Z [rank2]:E1204 13:35:54.296000 430952 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2827806Z [rank2]:E1204 13:35:54.296000 430952 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2827964Z [rank2]:E1204 13:35:54.296000 430952 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2828242Z [rank2]:E1204 13:35:54.296000 430952 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2828378Z [rank2]:E1204 13:35:54.296000 430952 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2828659Z [rank2]:E1204 13:35:54.296000 430952 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2828806Z [rank2]:E1204 13:35:54.296000 430952 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2829336Z [rank2]:E1204 13:35:54.296000 430952 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 14848 on device 2. CUDA driver allocated memory was 2300575744 and is now 3418357760. 2025-12-04T13:38:32.2829451Z [rank2]:E1204 13:35:54.296000 430952 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2829686Z [rank2]:E1204 13:35:54.296000 430952 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2830081Z [rank2]:E1204 13:35:54.296000 430952 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda 2025-12-04T13:38:32.2830196Z [rank2]:E1204 13:35:54.296000 430952 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2830422Z [rank2]:E1204 13:35:54.296000 430952 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2830588Z [rank2]:E1204 13:35:54.296000 430952 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2830637Z dist init r=2, world=4 2025-12-04T13:38:32.2830773Z [rank3]:E1204 13:35:54.300000 430953 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2830930Z [rank3]:E1204 13:35:54.300000 430953 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2831217Z [rank3]:E1204 13:35:54.300000 430953 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2831369Z [rank3]:E1204 13:35:54.300000 430953 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2831651Z [rank3]:E1204 13:35:54.300000 430953 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2831773Z [rank3]:E1204 13:35:54.300000 430953 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2832047Z [rank3]:E1204 13:35:54.300000 430953 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 
2025-12-04T13:38:32.2832206Z [rank3]:E1204 13:35:54.300000 430953 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2832480Z [rank3]:E1204 13:35:54.300000 430953 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2832626Z [rank3]:E1204 13:35:54.300000 430953 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2832908Z [rank3]:E1204 13:35:54.300000 430953 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2833044Z [rank3]:E1204 13:35:54.300000 430953 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2833332Z [rank3]:E1204 13:35:54.300000 430953 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2833478Z [rank3]:E1204 13:35:54.300000 430953 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2833993Z [rank3]:E1204 13:35:54.300000 430953 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 23040 on device 3. CUDA driver allocated memory was 2250244096 and is now 3368026112. 2025-12-04T13:38:32.2834106Z [rank3]:E1204 13:35:54.300000 430953 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2834300Z [rank3]:E1204 13:35:54.300000 430953 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2834712Z [rank3]:E1204 13:35:54.300000 430953 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda 2025-12-04T13:38:32.2834823Z [rank3]:E1204 13:35:54.300000 430953 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2835045Z [rank3]:E1204 13:35:54.300000 430953 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2835208Z [rank3]:E1204 13:35:54.300000 430953 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2835250Z dist init r=3, world=4 2025-12-04T13:38:32.2835583Z [rank0]:[W1204 13:35:54.429063152 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.2835621Z FAILED [7.6167s] [100%] 2025-12-04T13:38:32.2835623Z 2025-12-04T13:38:32.2835679Z =================================== FAILURES =================================== 2025-12-04T13:38:32.2835817Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda _ 2025-12-04T13:38:32.2835864Z Traceback (most recent call last): 2025-12-04T13:38:32.2836026Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.2836069Z self._join_processes(fn) 2025-12-04T13:38:32.2836240Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.2836304Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.2836481Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.2836525Z raise RuntimeError(error) 2025-12-04T13:38:32.2836603Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.2836647Z Traceback (most recent call last): 2025-12-04T13:38:32.2836807Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2836849Z getattr(self, test_name)() 2025-12-04T13:38:32.2837007Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2837041Z fn() 2025-12-04T13:38:32.2837192Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2837243Z method(*args, **kwargs) 2025-12-04T13:38:32.2837394Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2837434Z method(*args, **kwargs) 2025-12-04T13:38:32.2837584Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2837620Z with policy(): 2025-12-04T13:38:32.2837773Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2837814Z raise RuntimeError(msg) 2025-12-04T13:38:32.2838205Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 18944 on device 0. CUDA driver allocated memory was 2453667840 and is now 3571449856. 
2025-12-04T13:38:32.2838209Z 2025-12-04T13:38:32.2838282Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2838566Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda 2025-12-04T13:38:32.2838568Z 2025-12-04T13:38:32.2838654Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2838666Z 2025-12-04T13:38:32.2838725Z Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.2838768Z Traceback (most recent call last): 2025-12-04T13:38:32.2838932Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2838972Z getattr(self, test_name)() 2025-12-04T13:38:32.2839133Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2839167Z fn() 2025-12-04T13:38:32.2839317Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2839357Z method(*args, **kwargs) 2025-12-04T13:38:32.2839506Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2839546Z method(*args, **kwargs) 2025-12-04T13:38:32.2839727Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2839763Z with policy(): 2025-12-04T13:38:32.2839914Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2839955Z raise RuntimeError(msg) 2025-12-04T13:38:32.2840362Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 23040 on device 1. CUDA driver allocated memory was 2317352960 and is now 3435134976. 2025-12-04T13:38:32.2840364Z 2025-12-04T13:38:32.2840439Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2840711Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda 2025-12-04T13:38:32.2840714Z 2025-12-04T13:38:32.2840799Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2840801Z 2025-12-04T13:38:32.2840803Z 2025-12-04T13:38:32.2840877Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.2840963Z Process 0 terminated with exit code 10, terminating remaining processes. 
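Two other warnings recur in this log: the `barrier()` notice suggesting `device_id` in `init_process_group`, and the ProcessGroupNCCL message that `destroy_process_group()` was not called before exit. A rough sketch of both fixes, assuming a per-process `LOCAL_RANK` environment variable (hypothetical wiring, not the test harness):

```python
# Sketch addressing the two process-group warnings in this log: bind the group to a
# device at init time (mutes the barrier() warning) and tear it down before exit
# (avoids the ProcessGroupNCCL shutdown warning). Rank wiring here is hypothetical.
import os
import torch
import torch.distributed as dist

local_rank = int(os.environ.get("LOCAL_RANK", "0"))
device = torch.device("cuda", local_rank)
dist.init_process_group("nccl", device_id=device)  # device-bound group, per the c10d hint
try:
    dist.barrier()  # no longer warns about "using the device under current context"
    # ... workload ...
finally:
    dist.destroy_process_group()  # released explicitly, as the NCCL warning asks
```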
2025-12-04T13:38:32.2841222Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6f06e3aa3004e1f2.xml - 2025-12-04T13:38:32.2841284Z =========================== short test summary info ============================ 2025-12-04T13:38:32.2841570Z FAILED [7.6167s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.2841616Z Traceback (most recent call last): 2025-12-04T13:38:32.2841780Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2841821Z getattr(self, test_name)() 2025-12-04T13:38:32.2841981Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2842015Z fn() 2025-12-04T13:38:32.2842167Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2842205Z method(*args, **kwargs) 2025-12-04T13:38:32.2842373Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2842412Z method(*args, **kwargs) 2025-12-04T13:38:32.2842563Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2842611Z with policy(): 2025-12-04T13:38:32.2842761Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2842801Z raise RuntimeError(msg) 2025-12-04T13:38:32.2843196Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 18944 on device 0. CUDA driver allocated memory was 2453667840 and is now 3571449856. 
2025-12-04T13:38:32.2843199Z 2025-12-04T13:38:32.2843273Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2843544Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda 2025-12-04T13:38:32.2843546Z 2025-12-04T13:38:32.2843632Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2843634Z 2025-12-04T13:38:32.2843690Z Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.2843735Z Traceback (most recent call last): 2025-12-04T13:38:32.2843895Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2843947Z getattr(self, test_name)() 2025-12-04T13:38:32.2844105Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2844138Z fn() 2025-12-04T13:38:32.2844289Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2844328Z method(*args, **kwargs) 2025-12-04T13:38:32.2844478Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2844518Z method(*args, **kwargs) 2025-12-04T13:38:32.2844667Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2844702Z with policy(): 2025-12-04T13:38:32.2844852Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2844901Z raise RuntimeError(msg) 2025-12-04T13:38:32.2845291Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 23040 on device 1. CUDA driver allocated memory was 2317352960 and is now 3435134976. 2025-12-04T13:38:32.2845293Z 2025-12-04T13:38:32.2845365Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2845637Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda 2025-12-04T13:38:32.2845639Z 2025-12-04T13:38:32.2845724Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2845787Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.2845848Z ======================= 1 failed, 32 deselected in 7.77s ======================= 2025-12-04T13:38:32.2845885Z Got exit code 1 2025-12-04T13:38:32.2845924Z Retrying single test... 
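The failure itself comes from the memory-leak check enabled by `PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1`, which compares caching-allocator and driver-level memory before and after the test (the numbers quoted in the RuntimeError above). A rough, simplified illustration of that before/after comparison; this is not the actual checker in `torch/testing/_internal/common_utils.py`:

```python
# Rough illustration of the before/after comparison behind the "confirmed a leak" error
# above; not the actual implementation in torch/testing/_internal/common_utils.py.
import torch

def run_with_leak_check(fn, device=0):
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    allocator_before = torch.cuda.memory_allocated(device)       # caching-allocator bytes
    driver_free_before, _ = torch.cuda.mem_get_info(device)      # driver-level view

    fn()

    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    allocator_after = torch.cuda.memory_allocated(device)
    driver_free_after, _ = torch.cuda.mem_get_info(device)
    if allocator_after > allocator_before or driver_free_after < driver_free_before:
        raise RuntimeError(
            f"possible leak: caching allocator {allocator_before} -> {allocator_after} bytes, "
            f"driver free memory {driver_free_before} -> {driver_free_after} bytes"
        )
```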
2025-12-04T13:38:32.2846125Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-f99384b36d0e58a6.xml 2025-12-04T13:38:32.2846186Z ============================= test session starts ============================== 2025-12-04T13:38:32.2846298Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.2846351Z cachedir: .pytest_cache 2025-12-04T13:38:32.2846510Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.2846555Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.2846593Z configfile: pytest.ini 2025-12-04T13:38:32.2846759Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.2846831Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.2847099Z stepcurrent: skipping 28 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda 2025-12-04T13:38:32.2847141Z Running 1 items in this shard 2025-12-04T13:38:32.2847143Z 2025-12-04T13:38:32.2847490Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda I1204 13:35:58.654000 431283 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 431352 2025-12-04T13:38:32.2847645Z I1204 13:35:58.654000 431283 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 431353 2025-12-04T13:38:32.2847807Z I1204 13:35:58.655000 431283 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 431354 2025-12-04T13:38:32.2847960Z I1204 13:35:58.656000 431283 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 431355 2025-12-04T13:38:32.2848540Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2848579Z _warn_cpu_init() 2025-12-04T13:38:32.2849160Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2849199Z _warn_cpu_init() 2025-12-04T13:38:32.2849797Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2849833Z _warn_cpu_init() 2025-12-04T13:38:32.2850127Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.2850169Z return func(*args, **kwargs) 2025-12-04T13:38:32.2850747Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2850796Z _warn_cpu_init() 2025-12-04T13:38:32.2850938Z [rank1]:E1204 13:36:04.543000 431353 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2851101Z [rank1]:E1204 13:36:04.543000 431353 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2851393Z [rank1]:E1204 13:36:04.543000 431353 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2851550Z [rank1]:E1204 13:36:04.543000 431353 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2851835Z [rank1]:E1204 13:36:04.543000 431353 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2851959Z [rank1]:E1204 13:36:04.543000 431353 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2852242Z [rank1]:E1204 13:36:04.543000 431353 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2852403Z [rank1]:E1204 13:36:04.543000 431353 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2852681Z [rank1]:E1204 13:36:04.543000 431353 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2852829Z [rank1]:E1204 13:36:04.543000 431353 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2853107Z [rank1]:E1204 13:36:04.543000 431353 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2853243Z [rank1]:E1204 13:36:04.543000 431353 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2853534Z [rank1]:E1204 13:36:04.543000 
431353 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2853682Z [rank1]:E1204 13:36:04.543000 431353 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2854197Z [rank1]:E1204 13:36:04.543000 431353 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 1. CUDA driver allocated memory was 2317352960 and is now 3435134976. 2025-12-04T13:38:32.2854312Z [rank1]:E1204 13:36:04.543000 431353 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2854507Z [rank1]:E1204 13:36:04.543000 431353 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2854913Z [rank1]:E1204 13:36:04.543000 431353 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda 2025-12-04T13:38:32.2855035Z [rank1]:E1204 13:36:04.543000 431353 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2855244Z [rank1]:E1204 13:36:04.543000 431353 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2855408Z [rank1]:E1204 13:36:04.543000 431353 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2855449Z dist init r=1, world=4 2025-12-04T13:38:32.2855585Z [rank3]:E1204 13:36:04.589000 431355 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2855744Z [rank3]:E1204 13:36:04.589000 431355 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2856033Z [rank3]:E1204 13:36:04.589000 431355 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2856186Z [rank3]:E1204 13:36:04.589000 431355 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2856472Z [rank3]:E1204 13:36:04.589000 431355 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2856610Z [rank3]:E1204 13:36:04.589000 431355 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2856887Z [rank3]:E1204 13:36:04.589000 431355 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2857032Z [rank3]:E1204 13:36:04.589000 431355 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:38:32.2857311Z [rank3]:E1204 13:36:04.589000 431355 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2857459Z [rank3]:E1204 13:36:04.589000 431355 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2857746Z [rank3]:E1204 13:36:04.589000 431355 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2857882Z [rank3]:E1204 13:36:04.589000 431355 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2858158Z [rank3]:E1204 13:36:04.589000 431355 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2858306Z [rank3]:E1204 13:36:04.589000 431355 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2858822Z [rank3]:E1204 13:36:04.589000 431355 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3368026112. 2025-12-04T13:38:32.2858946Z [rank3]:E1204 13:36:04.589000 431355 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2859140Z [rank3]:E1204 13:36:04.589000 431355 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2859547Z [rank3]:E1204 13:36:04.589000 431355 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda 2025-12-04T13:38:32.2859701Z [rank3]:E1204 13:36:04.589000 431355 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2859911Z [rank3]:E1204 13:36:04.589000 431355 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2860076Z [rank3]:E1204 13:36:04.589000 431355 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2860113Z dist init r=3, world=4 2025-12-04T13:38:32.2860250Z [rank0]:E1204 13:36:04.590000 431352 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2860409Z [rank0]:E1204 13:36:04.590000 431352 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2860698Z [rank0]:E1204 13:36:04.590000 431352 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2860868Z [rank0]:E1204 13:36:04.590000 431352 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2861151Z [rank0]:E1204 13:36:04.590000 431352 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2861273Z [rank0]:E1204 13:36:04.590000 431352 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2861548Z [rank0]:E1204 13:36:04.590000 431352 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2861694Z [rank0]:E1204 13:36:04.590000 431352 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2861982Z [rank0]:E1204 13:36:04.590000 431352 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2862128Z [rank0]:E1204 13:36:04.590000 431352 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2862402Z [rank0]:E1204 13:36:04.590000 431352 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2862537Z [rank0]:E1204 13:36:04.590000 431352 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2862817Z [rank0]:E1204 13:36:04.590000 431352 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2862965Z [rank0]:E1204 13:36:04.590000 431352 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2863492Z [rank0]:E1204 13:36:04.590000 431352 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 16896 on device 0. CUDA driver allocated memory was 2453667840 and is now 3571449856. 
2025-12-04T13:38:32.2863619Z [rank0]:E1204 13:36:04.590000 431352 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2863811Z [rank0]:E1204 13:36:04.590000 431352 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2864212Z [rank0]:E1204 13:36:04.590000 431352 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda 2025-12-04T13:38:32.2864325Z [rank0]:E1204 13:36:04.590000 431352 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2864535Z [rank0]:E1204 13:36:04.590000 431352 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2864698Z [rank0]:E1204 13:36:04.590000 431352 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2864738Z dist init r=0, world=4 2025-12-04T13:38:32.2864873Z [rank2]:E1204 13:36:04.599000 431354 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2865043Z [rank2]:E1204 13:36:04.599000 431354 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2865333Z [rank2]:E1204 13:36:04.599000 431354 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2865485Z [rank2]:E1204 13:36:04.599000 431354 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2865768Z [rank2]:E1204 13:36:04.599000 431354 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2865890Z [rank2]:E1204 13:36:04.599000 431354 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2866176Z [rank2]:E1204 13:36:04.599000 431354 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2866321Z [rank2]:E1204 13:36:04.599000 431354 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2866598Z [rank2]:E1204 13:36:04.599000 431354 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2866744Z [rank2]:E1204 13:36:04.599000 431354 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2867019Z [rank2]:E1204 13:36:04.599000 431354 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2867157Z [rank2]:E1204 13:36:04.599000 431354 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2867444Z [rank2]:E1204 13:36:04.599000 431354 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2867592Z [rank2]:E1204 13:36:04.599000 431354 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2868116Z [rank2]:E1204 13:36:04.599000 431354 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 2. CUDA driver allocated memory was 2300575744 and is now 3418357760. 2025-12-04T13:38:32.2868232Z [rank2]:E1204 13:36:04.599000 431354 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2868430Z [rank2]:E1204 13:36:04.599000 431354 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2868829Z [rank2]:E1204 13:36:04.599000 431354 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda 2025-12-04T13:38:32.2868942Z [rank2]:E1204 13:36:04.599000 431354 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2869151Z [rank2]:E1204 13:36:04.599000 431354 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2869324Z [rank2]:E1204 13:36:04.599000 431354 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2869361Z dist init r=2, world=4 2025-12-04T13:38:32.2869721Z [rank0]:[W1204 13:36:04.856590047 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.2869759Z FAILED [7.5162s] [100%] 2025-12-04T13:38:32.2869763Z 2025-12-04T13:38:32.2869818Z =================================== FAILURES =================================== 2025-12-04T13:38:32.2869958Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda _ 2025-12-04T13:38:32.2870003Z Traceback (most recent call last): 2025-12-04T13:38:32.2870168Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.2870227Z self._join_processes(fn) 2025-12-04T13:38:32.2870401Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.2870453Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.2870632Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.2870675Z raise RuntimeError(error) 2025-12-04T13:38:32.2870755Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.2870798Z Traceback (most recent call last): 2025-12-04T13:38:32.2870960Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2871002Z getattr(self, test_name)() 2025-12-04T13:38:32.2871163Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2871196Z fn() 2025-12-04T13:38:32.2871346Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2871387Z method(*args, **kwargs) 2025-12-04T13:38:32.2871550Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2871589Z method(*args, **kwargs) 2025-12-04T13:38:32.2871758Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2871794Z with policy(): 2025-12-04T13:38:32.2871948Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2871987Z raise RuntimeError(msg) 2025-12-04T13:38:32.2872383Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 1. CUDA driver allocated memory was 2317352960 and is now 3435134976. 
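The leak-check failure above is driven by PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1, which compares device memory counters recorded before and after the test. A minimal sketch of that idea only (not the actual torch.testing._internal harness; run_with_leak_check and its simple greater-than comparison are illustrative assumptions):

import torch

def run_with_leak_check(test_fn, device=0):
    # Illustrative only: snapshot caching-allocator usage before the test ...
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    before = torch.cuda.memory_allocated(device)
    test_fn()
    # ... and compare after it; any growth is reported as a possible leak,
    # mirroring the "allocated memory was X and is now Y" message in the log.
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    after = torch.cuda.memory_allocated(device)
    if after > before:
        raise RuntimeError(
            f"possible leak on device {device}: allocator memory was {before} and is now {after}"
        )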
2025-12-04T13:38:32.2872386Z 2025-12-04T13:38:32.2872462Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2872735Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda 2025-12-04T13:38:32.2872738Z 2025-12-04T13:38:32.2872826Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2872829Z 2025-12-04T13:38:32.2872830Z 2025-12-04T13:38:32.2872903Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.2873008Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.2873242Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-f99384b36d0e58a6.xml - 2025-12-04T13:38:32.2873302Z =========================== short test summary info ============================ 2025-12-04T13:38:32.2873592Z FAILED [7.5162s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.2873637Z Traceback (most recent call last): 2025-12-04T13:38:32.2873801Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2873842Z getattr(self, test_name)() 2025-12-04T13:38:32.2874002Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2874035Z fn() 2025-12-04T13:38:32.2874199Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2874237Z method(*args, **kwargs) 2025-12-04T13:38:32.2874390Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2874428Z method(*args, **kwargs) 2025-12-04T13:38:32.2874579Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2874617Z with policy(): 2025-12-04T13:38:32.2874770Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2874809Z raise RuntimeError(msg) 2025-12-04T13:38:32.2875203Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 1. CUDA driver allocated memory was 2317352960 and is now 3435134976. 2025-12-04T13:38:32.2875206Z 2025-12-04T13:38:32.2875278Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2875562Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda 2025-12-04T13:38:32.2875573Z 2025-12-04T13:38:32.2875659Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2875723Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.2875786Z ======================= 1 failed, 32 deselected in 7.68s ======================= 2025-12-04T13:38:32.2878374Z Got exit code 1 2025-12-04T13:38:32.2878604Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda 2025-12-04T13:38:32.2878734Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:32.2878922Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-04695ebb741ba6d1.xml 2025-12-04T13:38:32.2878981Z ============================= test session starts ============================== 2025-12-04T13:38:32.2879095Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.2879137Z cachedir: .pytest_cache 2025-12-04T13:38:32.2879295Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.2879341Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.2879398Z configfile: pytest.ini 2025-12-04T13:38:32.2879561Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.2879671Z collecting ... collected 60 items / 29 deselected / 31 selected 2025-12-04T13:38:32.2879724Z stepcurrent: skipping 29 already run items. 2025-12-04T13:38:32.2879767Z Running 4 items in this shard 2025-12-04T13:38:32.2879770Z 2025-12-04T13:38:32.2880131Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda I1204 13:36:08.755000 431685 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 431754 2025-12-04T13:38:32.2880288Z I1204 13:36:08.756000 431685 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 431755 2025-12-04T13:38:32.2880439Z I1204 13:36:08.756000 431685 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 431756 2025-12-04T13:38:32.2880611Z I1204 13:36:08.757000 431685 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 431757 2025-12-04T13:38:32.2881195Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2881234Z _warn_cpu_init() 2025-12-04T13:38:32.2881806Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
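The UserWarning above recommends giving FSDP a device_id so sharding initialization runs on GPU and sync_module_states=True can perform its broadcast. A minimal sketch of that usage, assuming the process group is already initialized (as the test harness does) and that a torchrun-style launcher exported LOCAL_RANK; nn.Linear stands in for the test model:

import os
import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# Assumes torch.distributed is already initialized and LOCAL_RANK is set by the launcher.
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

module = nn.Linear(8, 8)  # stand-in for the test model, still on CPU
fsdp_module = FSDP(
    module,
    device_id=torch.cuda.current_device(),  # moves `module` to GPU for sharding init
    sync_module_states=True,                # requires GPU residency for the broadcast
)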
2025-12-04T13:38:32.2881845Z _warn_cpu_init() 2025-12-04T13:38:32.2882149Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.2882191Z return func(*args, **kwargs) 2025-12-04T13:38:32.2882778Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2882815Z _warn_cpu_init() 2025-12-04T13:38:32.2883386Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2883423Z _warn_cpu_init() 2025-12-04T13:38:32.2883565Z [rank3]:E1204 13:36:14.649000 431757 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2883726Z [rank3]:E1204 13:36:14.649000 431757 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2884028Z [rank3]:E1204 13:36:14.649000 431757 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2884184Z [rank3]:E1204 13:36:14.649000 431757 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2884469Z [rank3]:E1204 13:36:14.649000 431757 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2884595Z [rank3]:E1204 13:36:14.649000 431757 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2884872Z [rank3]:E1204 13:36:14.649000 431757 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2885030Z [rank3]:E1204 13:36:14.649000 431757 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2885307Z [rank3]:E1204 13:36:14.649000 431757 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2885452Z [rank3]:E1204 13:36:14.649000 431757 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2885728Z [rank3]:E1204 13:36:14.649000 431757 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2885862Z [rank3]:E1204 13:36:14.649000 431757 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2886142Z [rank3]:E1204 13:36:14.649000 431757 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2886289Z [rank3]:E1204 13:36:14.649000 431757 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2886837Z [rank3]:E1204 13:36:14.649000 431757 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 23040 on device 3. CUDA driver allocated memory was 2250244096 and is now 3368026112. 2025-12-04T13:38:32.2886961Z [rank3]:E1204 13:36:14.649000 431757 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2887158Z [rank3]:E1204 13:36:14.649000 431757 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2887574Z [rank3]:E1204 13:36:14.649000 431757 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2887689Z [rank3]:E1204 13:36:14.649000 431757 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2887901Z [rank3]:E1204 13:36:14.649000 431757 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2888065Z [rank3]:E1204 13:36:14.649000 431757 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2888114Z dist init r=3, world=4 2025-12-04T13:38:32.2888252Z [rank2]:E1204 13:36:14.651000 431756 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2888412Z [rank2]:E1204 13:36:14.651000 431756 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2888702Z [rank2]:E1204 13:36:14.651000 431756 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2888857Z [rank2]:E1204 13:36:14.651000 431756 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2889141Z [rank2]:E1204 13:36:14.651000 431756 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2889275Z [rank2]:E1204 13:36:14.651000 431756 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2889552Z [rank2]:E1204 13:36:14.651000 
431756 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2889717Z [rank2]:E1204 13:36:14.651000 431756 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2889991Z [rank2]:E1204 13:36:14.651000 431756 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2890137Z [rank2]:E1204 13:36:14.651000 431756 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2890411Z [rank2]:E1204 13:36:14.651000 431756 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2890548Z [rank2]:E1204 13:36:14.651000 431756 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2890845Z [rank2]:E1204 13:36:14.651000 431756 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2891005Z [rank2]:E1204 13:36:14.651000 431756 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2891541Z [rank2]:E1204 13:36:14.651000 431756 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 2. CUDA driver allocated memory was 2300575744 and is now 3418357760. 
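The barrier() UserWarning repeated throughout this run points at the device_id argument of init_process_group. A minimal sketch of that suggestion, assuming a torchrun-style launch (env:// rendezvous, LOCAL_RANK exported) and a PyTorch build recent enough to accept device_id; on these ROCm runners the "nccl" backend maps to RCCL:

import os
import torch
import torch.distributed as dist

local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

# Binding the process group to an explicit device lets barrier() use that device
# instead of inferring one from the current context (which emits the warning above).
dist.init_process_group(
    backend="nccl",
    device_id=torch.device("cuda", local_rank),
)
dist.barrier()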
2025-12-04T13:38:32.2891655Z [rank2]:E1204 13:36:14.651000 431756 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2891851Z [rank2]:E1204 13:36:14.651000 431756 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2892261Z [rank2]:E1204 13:36:14.651000 431756 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2892374Z [rank2]:E1204 13:36:14.651000 431756 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2892601Z [rank2]:E1204 13:36:14.651000 431756 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2892763Z [rank2]:E1204 13:36:14.651000 431756 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2892803Z dist init r=2, world=4 2025-12-04T13:38:32.2892938Z [rank0]:E1204 13:36:14.716000 431754 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2893098Z [rank0]:E1204 13:36:14.716000 431754 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2893384Z [rank0]:E1204 13:36:14.716000 431754 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2893551Z [rank0]:E1204 13:36:14.716000 431754 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2893835Z [rank0]:E1204 13:36:14.716000 431754 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2893958Z [rank0]:E1204 13:36:14.716000 431754 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2894232Z [rank0]:E1204 13:36:14.716000 431754 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2894378Z [rank0]:E1204 13:36:14.716000 431754 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2894654Z [rank0]:E1204 13:36:14.716000 431754 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2894799Z [rank0]:E1204 13:36:14.716000 431754 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2895083Z [rank0]:E1204 13:36:14.716000 431754 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2895225Z [rank0]:E1204 13:36:14.716000 431754 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2895500Z [rank0]:E1204 13:36:14.716000 431754 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2895649Z [rank0]:E1204 13:36:14.716000 431754 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2896178Z [rank0]:E1204 13:36:14.716000 431754 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 14848 on device 0. CUDA driver allocated memory was 2453667840 and is now 3571449856. 2025-12-04T13:38:32.2896291Z [rank0]:E1204 13:36:14.716000 431754 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2896485Z [rank0]:E1204 13:36:14.716000 431754 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2896908Z [rank0]:E1204 13:36:14.716000 431754 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2897020Z [rank0]:E1204 13:36:14.716000 431754 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2897229Z [rank0]:E1204 13:36:14.716000 431754 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2897393Z [rank0]:E1204 13:36:14.716000 431754 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2897430Z dist init r=0, world=4 2025-12-04T13:38:32.2897566Z [rank1]:E1204 13:36:14.729000 431755 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2897734Z [rank1]:E1204 13:36:14.729000 431755 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2898020Z [rank1]:E1204 13:36:14.729000 431755 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2898174Z [rank1]:E1204 13:36:14.729000 431755 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2898459Z [rank1]:E1204 13:36:14.729000 431755 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2898580Z [rank1]:E1204 13:36:14.729000 431755 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2898857Z [rank1]:E1204 13:36:14.729000 431755 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, 
in wrapper 2025-12-04T13:38:32.2899002Z [rank1]:E1204 13:36:14.729000 431755 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2899285Z [rank1]:E1204 13:36:14.729000 431755 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2899442Z [rank1]:E1204 13:36:14.729000 431755 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2899761Z [rank1]:E1204 13:36:14.729000 431755 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2899899Z [rank1]:E1204 13:36:14.729000 431755 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2900177Z [rank1]:E1204 13:36:14.729000 431755 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2900324Z [rank1]:E1204 13:36:14.729000 431755 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2900855Z [rank1]:E1204 13:36:14.729000 431755 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 1. CUDA driver allocated memory was 2317352960 and is now 3435134976. 2025-12-04T13:38:32.2900982Z [rank1]:E1204 13:36:14.729000 431755 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2901177Z [rank1]:E1204 13:36:14.729000 431755 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2901586Z [rank1]:E1204 13:36:14.729000 431755 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2901699Z [rank1]:E1204 13:36:14.729000 431755 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2901909Z [rank1]:E1204 13:36:14.729000 431755 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2902087Z [rank1]:E1204 13:36:14.729000 431755 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2902125Z dist init r=1, world=4 2025-12-04T13:38:32.2902459Z [rank0]:[W1204 13:36:15.087724731 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.2902498Z FAILED [7.5158s] [ 25%] 2025-12-04T13:38:32.2902501Z 2025-12-04T13:38:32.2902557Z =================================== FAILURES =================================== 2025-12-04T13:38:32.2902707Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda _ 2025-12-04T13:38:32.2902752Z Traceback (most recent call last): 2025-12-04T13:38:32.2902915Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.2902959Z self._join_processes(fn) 2025-12-04T13:38:32.2903131Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.2903184Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.2903375Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.2903418Z raise RuntimeError(error) 2025-12-04T13:38:32.2903510Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.2903554Z Traceback (most recent call last): 2025-12-04T13:38:32.2903715Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2903755Z getattr(self, test_name)() 2025-12-04T13:38:32.2903916Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2903950Z fn() 2025-12-04T13:38:32.2904101Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2904141Z method(*args, **kwargs) 2025-12-04T13:38:32.2904293Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2904333Z method(*args, **kwargs) 2025-12-04T13:38:32.2904485Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2904521Z with policy(): 2025-12-04T13:38:32.2904673Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2904713Z raise RuntimeError(msg) 2025-12-04T13:38:32.2905125Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 23040 on device 3. CUDA driver allocated memory was 2250244096 and is now 3368026112. 
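The ProcessGroupNCCL warning above ("destroy_process_group() was not called before program exit") can be avoided with an explicit teardown. A minimal sketch, assuming the usual init-then-cleanup structure around whatever the test or training body is:

import torch.distributed as dist

def main():
    dist.init_process_group(backend="nccl")
    try:
        pass  # training or test body goes here
    finally:
        # Tear the process group down explicitly so communicator resources are
        # released before interpreter exit, avoiding the warning in the log.
        dist.destroy_process_group()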
2025-12-04T13:38:32.2905128Z 2025-12-04T13:38:32.2905203Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2905488Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2905491Z 2025-12-04T13:38:32.2905579Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2905581Z 2025-12-04T13:38:32.2905583Z 2025-12-04T13:38:32.2905659Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.2905746Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.2905994Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-04695ebb741ba6d1.xml - 2025-12-04T13:38:32.2906055Z =========================== short test summary info ============================ 2025-12-04T13:38:32.2906350Z FAILED [7.5158s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.2906395Z Traceback (most recent call last): 2025-12-04T13:38:32.2906558Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2906599Z getattr(self, test_name)() 2025-12-04T13:38:32.2906759Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2906795Z fn() 2025-12-04T13:38:32.2906949Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2906988Z method(*args, **kwargs) 2025-12-04T13:38:32.2907149Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2907188Z method(*args, **kwargs) 2025-12-04T13:38:32.2907340Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2907387Z with policy(): 2025-12-04T13:38:32.2907538Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2907578Z raise RuntimeError(msg) 2025-12-04T13:38:32.2907978Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 23040 on device 3. CUDA driver allocated memory was 2250244096 and is now 3368026112. 2025-12-04T13:38:32.2907981Z 2025-12-04T13:38:32.2908054Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2908335Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2908338Z 2025-12-04T13:38:32.2908424Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2908486Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.2908547Z ======================= 1 failed, 29 deselected in 7.68s ======================= 2025-12-04T13:38:32.2908593Z Got exit code 1 2025-12-04T13:38:32.2908633Z Retrying single test... 2025-12-04T13:38:32.2908821Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-bcef664f761bd15e.xml 2025-12-04T13:38:32.2908879Z ============================= test session starts ============================== 2025-12-04T13:38:32.2908991Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.2909031Z cachedir: .pytest_cache 2025-12-04T13:38:32.2909189Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.2909235Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.2909274Z configfile: pytest.ini 2025-12-04T13:38:32.2909436Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.2909511Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.2909839Z stepcurrent: skipping 29 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2909883Z Running 1 items in this shard 2025-12-04T13:38:32.2909885Z 2025-12-04T13:38:32.2910239Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda I1204 13:36:19.003000 432087 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 432156 2025-12-04T13:38:32.2910395Z I1204 13:36:19.004000 432087 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 432157 2025-12-04T13:38:32.2910545Z I1204 13:36:19.005000 432087 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 432158 2025-12-04T13:38:32.2910696Z I1204 13:36:19.005000 432087 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 432159 2025-12-04T13:38:32.2911296Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2911346Z _warn_cpu_init() 2025-12-04T13:38:32.2911921Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.2911958Z _warn_cpu_init() 2025-12-04T13:38:32.2912526Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2912563Z _warn_cpu_init() 2025-12-04T13:38:32.2913128Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2913177Z _warn_cpu_init() 2025-12-04T13:38:32.2913467Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.2913509Z return func(*args, **kwargs) 2025-12-04T13:38:32.2913650Z [rank2]:E1204 13:36:24.924000 432158 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2913811Z [rank2]:E1204 13:36:24.924000 432158 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2914111Z [rank2]:E1204 13:36:24.924000 432158 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2914268Z [rank2]:E1204 13:36:24.924000 432158 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2914554Z [rank2]:E1204 13:36:24.924000 432158 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2914678Z [rank2]:E1204 13:36:24.924000 432158 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2914957Z [rank2]:E1204 13:36:24.924000 432158 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2915106Z [rank2]:E1204 13:36:24.924000 432158 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2915386Z [rank2]:E1204 13:36:24.924000 432158 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2915542Z [rank2]:E1204 13:36:24.924000 432158 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2915815Z [rank2]:E1204 13:36:24.924000 432158 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2915961Z [rank2]:E1204 13:36:24.924000 432158 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2916236Z [rank2]:E1204 13:36:24.924000 432158 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2916385Z [rank2]:E1204 13:36:24.924000 432158 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2916914Z [rank2]:E1204 13:36:24.924000 432158 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 2. CUDA driver allocated memory was 2300575744 and is now 3418357760. 2025-12-04T13:38:32.2917030Z [rank2]:E1204 13:36:24.924000 432158 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2917227Z [rank2]:E1204 13:36:24.924000 432158 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2917646Z [rank2]:E1204 13:36:24.924000 432158 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2917758Z [rank2]:E1204 13:36:24.924000 432158 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2917967Z [rank2]:E1204 13:36:24.924000 432158 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2918132Z [rank2]:E1204 13:36:24.924000 432158 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2918170Z dist init r=2, world=4 2025-12-04T13:38:32.2918316Z [rank0]:E1204 13:36:24.931000 432156 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2918474Z [rank0]:E1204 13:36:24.931000 432156 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2918763Z [rank0]:E1204 13:36:24.931000 432156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2918916Z [rank0]:E1204 13:36:24.931000 432156 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2919198Z [rank0]:E1204 13:36:24.931000 432156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2919323Z [rank0]:E1204 13:36:24.931000 432156 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2919674Z [rank0]:E1204 13:36:24.931000 
432156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2919835Z [rank0]:E1204 13:36:24.931000 432156 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2920110Z [rank0]:E1204 13:36:24.931000 432156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2920271Z [rank0]:E1204 13:36:24.931000 432156 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2920547Z [rank0]:E1204 13:36:24.931000 432156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2920681Z [rank0]:E1204 13:36:24.931000 432156 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2920959Z [rank0]:E1204 13:36:24.931000 432156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2921106Z [rank0]:E1204 13:36:24.931000 432156 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2921639Z [rank0]:E1204 13:36:24.931000 432156 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3571449856. 
2025-12-04T13:38:32.2921773Z [rank0]:E1204 13:36:24.931000 432156 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2921966Z [rank0]:E1204 13:36:24.931000 432156 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2922375Z [rank0]:E1204 13:36:24.931000 432156 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2922488Z [rank0]:E1204 13:36:24.931000 432156 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2922715Z [rank0]:E1204 13:36:24.931000 432156 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2922877Z [rank0]:E1204 13:36:24.931000 432156 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2922915Z dist init r=0, world=4 2025-12-04T13:38:32.2923052Z [rank3]:E1204 13:36:24.965000 432159 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2923210Z [rank3]:E1204 13:36:24.965000 432159 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2923501Z [rank3]:E1204 13:36:24.965000 432159 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2923656Z [rank3]:E1204 13:36:24.965000 432159 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2923938Z [rank3]:E1204 13:36:24.965000 432159 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2924069Z [rank3]:E1204 13:36:24.965000 432159 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2924344Z [rank3]:E1204 13:36:24.965000 432159 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2924500Z [rank3]:E1204 13:36:24.965000 432159 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2924776Z [rank3]:E1204 13:36:24.965000 432159 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2924922Z [rank3]:E1204 13:36:24.965000 432159 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2925196Z [rank3]:E1204 13:36:24.965000 432159 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2925333Z [rank3]:E1204 13:36:24.965000 432159 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2925609Z [rank3]:E1204 13:36:24.965000 432159 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2925769Z [rank3]:E1204 13:36:24.965000 432159 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2926294Z [rank3]:E1204 13:36:24.965000 432159 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 3. CUDA driver allocated memory was 2250244096 and is now 3368026112. 2025-12-04T13:38:32.2926408Z [rank3]:E1204 13:36:24.965000 432159 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2926602Z [rank3]:E1204 13:36:24.965000 432159 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2927020Z [rank3]:E1204 13:36:24.965000 432159 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2927133Z [rank3]:E1204 13:36:24.965000 432159 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2927344Z [rank3]:E1204 13:36:24.965000 432159 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2927508Z [rank3]:E1204 13:36:24.965000 432159 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2927546Z dist init r=3, world=4 2025-12-04T13:38:32.2927682Z [rank1]:E1204 13:36:24.976000 432157 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2927839Z [rank1]:E1204 13:36:24.976000 432157 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2928130Z [rank1]:E1204 13:36:24.976000 432157 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2928293Z [rank1]:E1204 13:36:24.976000 432157 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2928575Z [rank1]:E1204 13:36:24.976000 432157 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2928707Z [rank1]:E1204 13:36:24.976000 432157 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2928982Z [rank1]:E1204 13:36:24.976000 432157 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, 
in wrapper 2025-12-04T13:38:32.2929129Z [rank1]:E1204 13:36:24.976000 432157 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2929408Z [rank1]:E1204 13:36:24.976000 432157 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2929554Z [rank1]:E1204 13:36:24.976000 432157 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2929855Z [rank1]:E1204 13:36:24.976000 432157 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2930006Z [rank1]:E1204 13:36:24.976000 432157 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2930282Z [rank1]:E1204 13:36:24.976000 432157 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2930433Z [rank1]:E1204 13:36:24.976000 432157 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2930961Z [rank1]:E1204 13:36:24.976000 432157 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 1. CUDA driver allocated memory was 2317352960 and is now 3435134976. 2025-12-04T13:38:32.2931075Z [rank1]:E1204 13:36:24.976000 432157 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2931284Z [rank1]:E1204 13:36:24.976000 432157 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2931694Z [rank1]:E1204 13:36:24.976000 432157 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2931806Z [rank1]:E1204 13:36:24.976000 432157 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2932015Z [rank1]:E1204 13:36:24.976000 432157 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2932179Z [rank1]:E1204 13:36:24.976000 432157 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2932216Z dist init r=1, world=4 2025-12-04T13:38:32.2932562Z [rank0]:[W1204 13:36:25.125010451 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.2932601Z FAILED [7.6174s] [100%] 2025-12-04T13:38:32.2932603Z 2025-12-04T13:38:32.2932658Z =================================== FAILURES =================================== 2025-12-04T13:38:32.2932821Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda _ 2025-12-04T13:38:32.2932868Z Traceback (most recent call last): 2025-12-04T13:38:32.2933030Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.2933074Z self._join_processes(fn) 2025-12-04T13:38:32.2933246Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.2933299Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.2933476Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.2933519Z raise RuntimeError(error) 2025-12-04T13:38:32.2933597Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.2933642Z Traceback (most recent call last): 2025-12-04T13:38:32.2933801Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2933843Z getattr(self, test_name)() 2025-12-04T13:38:32.2933999Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2934044Z fn() 2025-12-04T13:38:32.2934194Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2934235Z method(*args, **kwargs) 2025-12-04T13:38:32.2934384Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2934424Z method(*args, **kwargs) 2025-12-04T13:38:32.2934574Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2934613Z with policy(): 2025-12-04T13:38:32.2934765Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2934805Z raise RuntimeError(msg) 2025-12-04T13:38:32.2935216Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3571449856. 
2025-12-04T13:38:32.2935220Z 2025-12-04T13:38:32.2935294Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2935583Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2935586Z 2025-12-04T13:38:32.2935672Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2935674Z 2025-12-04T13:38:32.2935732Z Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.2935775Z Traceback (most recent call last): 2025-12-04T13:38:32.2935938Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2935980Z getattr(self, test_name)() 2025-12-04T13:38:32.2936140Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2936174Z fn() 2025-12-04T13:38:32.2936334Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2936373Z method(*args, **kwargs) 2025-12-04T13:38:32.2936525Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2936581Z method(*args, **kwargs) 2025-12-04T13:38:32.2936730Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2936766Z with policy(): 2025-12-04T13:38:32.2936918Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2936961Z raise RuntimeError(msg) 2025-12-04T13:38:32.2937361Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 2. CUDA driver allocated memory was 2300575744 and is now 3418357760. 2025-12-04T13:38:32.2937363Z 2025-12-04T13:38:32.2937437Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2937724Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2937726Z 2025-12-04T13:38:32.2937812Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2937827Z 2025-12-04T13:38:32.2937828Z 2025-12-04T13:38:32.2937904Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.2937994Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:38:32.2938229Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-bcef664f761bd15e.xml - 2025-12-04T13:38:32.2938293Z =========================== short test summary info ============================ 2025-12-04T13:38:32.2938587Z FAILED [7.6174s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.2938635Z Traceback (most recent call last): 2025-12-04T13:38:32.2938798Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2938844Z getattr(self, test_name)() 2025-12-04T13:38:32.2939016Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2939055Z fn() 2025-12-04T13:38:32.2939210Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2939254Z method(*args, **kwargs) 2025-12-04T13:38:32.2939408Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2939451Z method(*args, **kwargs) 2025-12-04T13:38:32.2939641Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2939682Z with policy(): 2025-12-04T13:38:32.2939838Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2939880Z raise RuntimeError(msg) 2025-12-04T13:38:32.2940292Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3571449856. 
2025-12-04T13:38:32.2940295Z 2025-12-04T13:38:32.2940369Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2940670Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2940673Z 2025-12-04T13:38:32.2940759Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2940761Z 2025-12-04T13:38:32.2940824Z Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.2940869Z Traceback (most recent call last): 2025-12-04T13:38:32.2941033Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2941074Z getattr(self, test_name)() 2025-12-04T13:38:32.2941237Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2941274Z fn() 2025-12-04T13:38:32.2941424Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2941467Z method(*args, **kwargs) 2025-12-04T13:38:32.2941617Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2941658Z method(*args, **kwargs) 2025-12-04T13:38:32.2941808Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2941865Z with policy(): 2025-12-04T13:38:32.2942018Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2942061Z raise RuntimeError(msg) 2025-12-04T13:38:32.2942459Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 2. CUDA driver allocated memory was 2300575744 and is now 3418357760. 2025-12-04T13:38:32.2942462Z 2025-12-04T13:38:32.2942538Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2942819Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2942822Z 2025-12-04T13:38:32.2942923Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2942988Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.2943048Z ======================= 1 failed, 32 deselected in 7.78s ======================= 2025-12-04T13:38:32.2943089Z Got exit code 1 2025-12-04T13:38:32.2943130Z Retrying single test... 
2025-12-04T13:38:32.2943324Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-cb3378da80874a3f.xml 2025-12-04T13:38:32.2943383Z ============================= test session starts ============================== 2025-12-04T13:38:32.2943497Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.2943537Z cachedir: .pytest_cache 2025-12-04T13:38:32.2943698Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.2943746Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.2943790Z configfile: pytest.ini 2025-12-04T13:38:32.2943953Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.2944039Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.2944319Z stepcurrent: skipping 29 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2944377Z Running 1 items in this shard 2025-12-04T13:38:32.2944379Z 2025-12-04T13:38:32.2944734Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda I1204 13:36:29.118000 432489 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 432558 2025-12-04T13:38:32.2944892Z I1204 13:36:29.119000 432489 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 432559 2025-12-04T13:38:32.2945048Z I1204 13:36:29.119000 432489 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 432560 2025-12-04T13:38:32.2945200Z I1204 13:36:29.120000 432489 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 432561 2025-12-04T13:38:32.2945785Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2945833Z _warn_cpu_init() 2025-12-04T13:38:32.2946406Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2946446Z _warn_cpu_init() 2025-12-04T13:38:32.2946742Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T13:38:32.2946787Z return func(*args, **kwargs) 2025-12-04T13:38:32.2947369Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2947409Z _warn_cpu_init() 2025-12-04T13:38:32.2947974Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2948015Z _warn_cpu_init() 2025-12-04T13:38:32.2948161Z [rank1]:E1204 13:36:35.113000 432559 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2948321Z [rank1]:E1204 13:36:35.113000 432559 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2948623Z [rank1]:E1204 13:36:35.113000 432559 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2948791Z [rank1]:E1204 13:36:35.113000 432559 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2949082Z [rank1]:E1204 13:36:35.113000 432559 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2949208Z [rank1]:E1204 13:36:35.113000 432559 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2949492Z [rank1]:E1204 13:36:35.113000 432559 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2949683Z [rank1]:E1204 13:36:35.113000 432559 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2949962Z [rank1]:E1204 13:36:35.113000 432559 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2950111Z [rank1]:E1204 13:36:35.113000 432559 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2950392Z [rank1]:E1204 13:36:35.113000 432559 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2950542Z [rank1]:E1204 13:36:35.113000 432559 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2950820Z [rank1]:E1204 
13:36:35.113000 432559 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2950970Z [rank1]:E1204 13:36:35.113000 432559 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2951523Z [rank1]:E1204 13:36:35.113000 432559 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 14848 on device 1. CUDA driver allocated memory was 2317352960 and is now 3435134976. 2025-12-04T13:38:32.2951639Z [rank1]:E1204 13:36:35.113000 432559 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2951836Z [rank1]:E1204 13:36:35.113000 432559 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2952249Z [rank1]:E1204 13:36:35.113000 432559 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2952365Z [rank1]:E1204 13:36:35.113000 432559 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2952577Z [rank1]:E1204 13:36:35.113000 432559 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2952743Z [rank1]:E1204 13:36:35.113000 432559 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2952784Z dist init r=1, world=4 2025-12-04T13:38:32.2952939Z [rank2]:E1204 13:36:35.142000 432560 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2953113Z [rank2]:E1204 13:36:35.142000 432560 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2953401Z [rank2]:E1204 13:36:35.142000 432560 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2953559Z [rank2]:E1204 13:36:35.142000 432560 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2953843Z [rank2]:E1204 13:36:35.142000 432560 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2953969Z [rank2]:E1204 13:36:35.142000 432560 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2954247Z [rank2]:E1204 13:36:35.142000 432560 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2954395Z [rank2]:E1204 13:36:35.142000 432560 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2954674Z [rank2]:E1204 13:36:35.142000 432560 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2954832Z [rank2]:E1204 13:36:35.142000 432560 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2955111Z [rank2]:E1204 13:36:35.142000 432560 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2955247Z [rank2]:E1204 13:36:35.142000 432560 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2955527Z [rank2]:E1204 13:36:35.142000 432560 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2955684Z [rank2]:E1204 13:36:35.142000 432560 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2956215Z [rank2]:E1204 13:36:35.142000 432560 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 14848 on device 2. CUDA driver allocated memory was 2300575744 and is now 3418357760. 2025-12-04T13:38:32.2956332Z [rank2]:E1204 13:36:35.142000 432560 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2956526Z [rank2]:E1204 13:36:35.142000 432560 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2956937Z [rank2]:E1204 13:36:35.142000 432560 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2957049Z [rank2]:E1204 13:36:35.142000 432560 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2957271Z [rank2]:E1204 13:36:35.142000 432560 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2957446Z [rank2]:E1204 13:36:35.142000 432560 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2957487Z dist init r=2, world=4 2025-12-04T13:38:32.2957626Z [rank3]:E1204 13:36:35.148000 432561 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2957789Z [rank3]:E1204 13:36:35.148000 432561 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2958080Z [rank3]:E1204 13:36:35.148000 432561 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", 
line 925, in run_test 2025-12-04T13:38:32.2958235Z [rank3]:E1204 13:36:35.148000 432561 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2958522Z [rank3]:E1204 13:36:35.148000 432561 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2958647Z [rank3]:E1204 13:36:35.148000 432561 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2958928Z [rank3]:E1204 13:36:35.148000 432561 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2959085Z [rank3]:E1204 13:36:35.148000 432561 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2959363Z [rank3]:E1204 13:36:35.148000 432561 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2959512Z [rank3]:E1204 13:36:35.148000 432561 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2959819Z [rank3]:E1204 13:36:35.148000 432561 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2959970Z [rank3]:E1204 13:36:35.148000 432561 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2960248Z [rank3]:E1204 13:36:35.148000 432561 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2960399Z [rank3]:E1204 13:36:35.148000 432561 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2960931Z [rank3]:E1204 13:36:35.148000 432561 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 3. CUDA driver allocated memory was 2250244096 and is now 3368026112. 
2025-12-04T13:38:32.2961046Z [rank3]:E1204 13:36:35.148000 432561 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2961245Z [rank3]:E1204 13:36:35.148000 432561 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2961667Z [rank3]:E1204 13:36:35.148000 432561 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2961795Z [rank3]:E1204 13:36:35.148000 432561 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2962004Z [rank3]:E1204 13:36:35.148000 432561 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2962172Z [rank3]:E1204 13:36:35.148000 432561 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2962209Z dist init r=3, world=4 2025-12-04T13:38:32.2962347Z [rank0]:E1204 13:36:35.165000 432558 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2962510Z [rank0]:E1204 13:36:35.165000 432558 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2962799Z [rank0]:E1204 13:36:35.165000 432558 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2962956Z [rank0]:E1204 13:36:35.165000 432558 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2963241Z [rank0]:E1204 13:36:35.165000 432558 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2963382Z [rank0]:E1204 13:36:35.165000 432558 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2963661Z [rank0]:E1204 13:36:35.165000 432558 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2963811Z [rank0]:E1204 13:36:35.165000 432558 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2964093Z [rank0]:E1204 13:36:35.165000 432558 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2964248Z [rank0]:E1204 13:36:35.165000 432558 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2964525Z [rank0]:E1204 13:36:35.165000 432558 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2964661Z [rank0]:E1204 13:36:35.165000 432558 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2964940Z [rank0]:E1204 13:36:35.165000 432558 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2965088Z [rank0]:E1204 13:36:35.165000 432558 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2965624Z [rank0]:E1204 13:36:35.165000 432558 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 18944 on device 0. CUDA driver allocated memory was 2453667840 and is now 3571449856. 2025-12-04T13:38:32.2965751Z [rank0]:E1204 13:36:35.165000 432558 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2965946Z [rank0]:E1204 13:36:35.165000 432558 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2966366Z [rank0]:E1204 13:36:35.165000 432558 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2966479Z [rank0]:E1204 13:36:35.165000 432558 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2966687Z [rank0]:E1204 13:36:35.165000 432558 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2966852Z [rank0]:E1204 13:36:35.165000 432558 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2966892Z dist init r=0, world=4 2025-12-04T13:38:32.2967228Z [rank0]:[W1204 13:36:35.429004777 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.2967266Z FAILED [7.7164s] [100%] 2025-12-04T13:38:32.2967268Z 2025-12-04T13:38:32.2967326Z =================================== FAILURES =================================== 2025-12-04T13:38:32.2967493Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda _ 2025-12-04T13:38:32.2967542Z Traceback (most recent call last): 2025-12-04T13:38:32.2967708Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.2967756Z self._join_processes(fn) 2025-12-04T13:38:32.2967929Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.2967985Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.2968163Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.2968208Z raise RuntimeError(error) 2025-12-04T13:38:32.2968287Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.2968336Z Traceback (most recent call last): 2025-12-04T13:38:32.2968515Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2968561Z getattr(self, test_name)() 2025-12-04T13:38:32.2968721Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2968758Z fn() 2025-12-04T13:38:32.2968910Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2968954Z method(*args, **kwargs) 2025-12-04T13:38:32.2969104Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2969148Z method(*args, **kwargs) 2025-12-04T13:38:32.2969302Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2969340Z with policy(): 2025-12-04T13:38:32.2969498Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2969539Z raise RuntimeError(msg) 2025-12-04T13:38:32.2969989Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 14848 on device 1. CUDA driver allocated memory was 2317352960 and is now 3435134976. 
2025-12-04T13:38:32.2970003Z 2025-12-04T13:38:32.2970078Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2970365Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2970368Z 2025-12-04T13:38:32.2970457Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2970459Z 2025-12-04T13:38:32.2970464Z 2025-12-04T13:38:32.2970539Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.2970630Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.2970862Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-cb3378da80874a3f.xml - 2025-12-04T13:38:32.2970924Z =========================== short test summary info ============================ 2025-12-04T13:38:32.2971226Z FAILED [7.7164s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.2971288Z Traceback (most recent call last): 2025-12-04T13:38:32.2971453Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2971497Z getattr(self, test_name)() 2025-12-04T13:38:32.2971657Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2971694Z fn() 2025-12-04T13:38:32.2971847Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2971889Z method(*args, **kwargs) 2025-12-04T13:38:32.2972041Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2972084Z method(*args, **kwargs) 2025-12-04T13:38:32.2972235Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2972274Z with policy(): 2025-12-04T13:38:32.2972443Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2972485Z raise RuntimeError(msg) 2025-12-04T13:38:32.2972885Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 14848 on device 1. CUDA driver allocated memory was 2317352960 and is now 3435134976. 2025-12-04T13:38:32.2972888Z 2025-12-04T13:38:32.2972964Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2973248Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2973251Z 2025-12-04T13:38:32.2973338Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2973402Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.2973463Z ======================= 1 failed, 32 deselected in 7.87s ======================= 2025-12-04T13:38:32.2973503Z Got exit code 1 2025-12-04T13:38:32.2973745Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.2973885Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:32.2974070Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-5126715a40a2f0a6.xml 2025-12-04T13:38:32.2974128Z ============================= test session starts ============================== 2025-12-04T13:38:32.2974245Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.2974286Z cachedir: .pytest_cache 2025-12-04T13:38:32.2974448Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.2974494Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.2974538Z configfile: pytest.ini 2025-12-04T13:38:32.2974701Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.2974778Z collecting ... collected 60 items / 30 deselected / 30 selected 2025-12-04T13:38:32.2974831Z stepcurrent: skipping 30 already run items. 2025-12-04T13:38:32.2974877Z Running 3 items in this shard 2025-12-04T13:38:32.2974878Z 2025-12-04T13:38:32.2975177Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_none_cuda I1204 13:36:39.371000 432891 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 432960 2025-12-04T13:38:32.2975344Z I1204 13:36:39.372000 432891 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 432961 2025-12-04T13:38:32.2975495Z I1204 13:36:39.372000 432891 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 432962 2025-12-04T13:38:32.2975649Z I1204 13:36:39.373000 432891 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 432963 2025-12-04T13:38:32.2976012Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.2976064Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.2976428Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.2976477Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.2976831Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.2976877Z self.encoder = TransformerEncoder( 
2025-12-04T13:38:32.2977233Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.2977276Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.2977874Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2977916Z _warn_cpu_init() 2025-12-04T13:38:32.2978486Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2978536Z _warn_cpu_init() 2025-12-04T13:38:32.2979106Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2979146Z _warn_cpu_init() 2025-12-04T13:38:32.2979752Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.2979802Z _warn_cpu_init() 2025-12-04T13:38:32.2980097Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T13:38:32.2980139Z return func(*args, **kwargs) 2025-12-04T13:38:32.2980285Z [rank1]:E1204 13:36:48.547000 432961 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2980448Z [rank1]:E1204 13:36:48.547000 432961 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2980741Z [rank1]:E1204 13:36:48.547000 432961 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2980912Z [rank1]:E1204 13:36:48.547000 432961 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2981199Z [rank1]:E1204 13:36:48.547000 432961 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2981326Z [rank1]:E1204 13:36:48.547000 432961 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2981602Z [rank1]:E1204 13:36:48.547000 432961 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2981753Z [rank1]:E1204 13:36:48.547000 432961 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2982031Z [rank1]:E1204 13:36:48.547000 432961 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2982181Z [rank1]:E1204 13:36:48.547000 432961 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2982467Z [rank1]:E1204 13:36:48.547000 432961 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2982617Z [rank1]:E1204 13:36:48.547000 432961 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2982896Z [rank1]:E1204 13:36:48.547000 432961 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2983045Z [rank1]:E1204 13:36:48.547000 432961 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2983519Z [rank1]:E1204 13:36:48.547000 432961 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 1. CUDA driver allocated memory was 2317352960 and is now 3919577088. 
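The _warn_cpu_init UserWarnings above recommend passing the device_id argument to FSDP so that sharding initialization runs on the GPU and sync_module_states=True can use GPU communication. A hedged sketch of that wrapping, assuming a process group is already initialized; module and rank are placeholders, not the test's own code.

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_on_gpu(module: torch.nn.Module, rank: int) -> FSDP:
        # device_id moves the CPU-constructed module to the given GPU before
        # sharding, which is what the warning recommends and what
        # sync_module_states=True requires.
        return FSDP(
            module,
            device_id=torch.device("cuda", rank),
            sync_module_states=True,
        )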
2025-12-04T13:38:32.2983634Z [rank1]:E1204 13:36:48.547000 432961 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2983832Z [rank1]:E1204 13:36:48.547000 432961 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2984181Z [rank1]:E1204 13:36:48.547000 432961 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda 2025-12-04T13:38:32.2984310Z [rank1]:E1204 13:36:48.547000 432961 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2984524Z [rank1]:E1204 13:36:48.547000 432961 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2984687Z [rank1]:E1204 13:36:48.547000 432961 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.2984731Z dist init r=1, world=4 2025-12-04T13:38:32.2984869Z [rank0]:E1204 13:36:48.548000 432960 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2985029Z [rank0]:E1204 13:36:48.548000 432960 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2985327Z [rank0]:E1204 13:36:48.548000 432960 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2985483Z [rank0]:E1204 13:36:48.548000 432960 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2985767Z [rank0]:E1204 13:36:48.548000 432960 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2985892Z [rank0]:E1204 13:36:48.548000 432960 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2986172Z [rank0]:E1204 13:36:48.548000 432960 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2986321Z [rank0]:E1204 13:36:48.548000 432960 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2986607Z [rank0]:E1204 13:36:48.548000 432960 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2986753Z [rank0]:E1204 13:36:48.548000 432960 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2987040Z [rank0]:E1204 13:36:48.548000 432960 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2987175Z [rank0]:E1204 13:36:48.548000 432960 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.2987456Z [rank0]:E1204 13:36:48.548000 432960 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2987607Z [rank0]:E1204 13:36:48.548000 432960 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2988078Z [rank0]:E1204 13:36:48.548000 432960 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 0. CUDA driver allocated memory was 2453667840 and is now 4055891968. 2025-12-04T13:38:32.2988195Z [rank0]:E1204 13:36:48.548000 432960 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2988400Z [rank0]:E1204 13:36:48.548000 432960 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2988757Z [rank0]:E1204 13:36:48.548000 432960 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda 2025-12-04T13:38:32.2988874Z [rank0]:E1204 13:36:48.548000 432960 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2989085Z [rank0]:E1204 13:36:48.548000 432960 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2989252Z [rank0]:E1204 13:36:48.548000 432960 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.2989291Z dist init r=0, world=4 2025-12-04T13:38:32.2989439Z [rank2]:E1204 13:36:48.590000 432962 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2989630Z [rank2]:E1204 13:36:48.590000 432962 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2989920Z [rank2]:E1204 13:36:48.590000 432962 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2990072Z [rank2]:E1204 13:36:48.590000 432962 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.2990360Z [rank2]:E1204 13:36:48.590000 432962 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2990486Z [rank2]:E1204 13:36:48.590000 432962 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2990763Z [rank2]:E1204 13:36:48.590000 432962 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2990927Z [rank2]:E1204 13:36:48.590000 432962 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T13:38:32.2991217Z [rank2]:E1204 13:36:48.590000 432962 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2991363Z [rank2]:E1204 13:36:48.590000 432962 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2991640Z [rank2]:E1204 13:36:48.590000 432962 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2991777Z [rank2]:E1204 13:36:48.590000 432962 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2992055Z [rank2]:E1204 13:36:48.590000 432962 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2992206Z [rank2]:E1204 13:36:48.590000 432962 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2992677Z [rank2]:E1204 13:36:48.590000 432962 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 2. CUDA driver allocated memory was 2300575744 and is now 3902799872. 2025-12-04T13:38:32.2992802Z [rank2]:E1204 13:36:48.590000 432962 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2992999Z [rank2]:E1204 13:36:48.590000 432962 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2993343Z [rank2]:E1204 13:36:48.590000 432962 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda 2025-12-04T13:38:32.2993459Z [rank2]:E1204 13:36:48.590000 432962 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2993686Z [rank2]:E1204 13:36:48.590000 432962 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2993850Z [rank2]:E1204 13:36:48.590000 432962 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.2993890Z dist init r=2, world=4 2025-12-04T13:38:32.2994028Z [rank3]:E1204 13:36:48.591000 432963 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.2994188Z [rank3]:E1204 13:36:48.591000 432963 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.2994476Z [rank3]:E1204 13:36:48.591000 432963 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.2994632Z [rank3]:E1204 13:36:48.591000 432963 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 
2025-12-04T13:38:32.2994917Z [rank3]:E1204 13:36:48.591000 432963 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.2995051Z [rank3]:E1204 13:36:48.591000 432963 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.2995330Z [rank3]:E1204 13:36:48.591000 432963 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2995485Z [rank3]:E1204 13:36:48.591000 432963 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2995764Z [rank3]:E1204 13:36:48.591000 432963 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.2995912Z [rank3]:E1204 13:36:48.591000 432963 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.2996192Z [rank3]:E1204 13:36:48.591000 432963 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.2996328Z [rank3]:E1204 13:36:48.591000 432963 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.2996606Z [rank3]:E1204 13:36:48.591000 432963 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.2996754Z [rank3]:E1204 13:36:48.591000 432963 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.2997235Z [rank3]:E1204 13:36:48.591000 432963 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 261632 on device 3. CUDA driver allocated memory was 2250244096 and is now 3852468224. 
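The RuntimeError on each rank reports caching-allocator and driver-level memory before and after the test body. The sketch below only illustrates that kind of before/after comparison; it is not the leak-check implementation in common_utils.py, and the zero-byte tolerance is made up.

    import torch

    def check_for_leak(fn, device: int = 0, tolerance_bytes: int = 0) -> None:
        # Caching-allocator view: bytes currently held by live tensors.
        alloc_before = torch.cuda.memory_allocated(device)
        # Driver-level view: bytes in use on the whole device (cudaMemGetInfo).
        free_before, total = torch.cuda.mem_get_info(device)

        fn()
        torch.cuda.synchronize(device)

        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)

        if alloc_after - alloc_before > tolerance_bytes:
            raise RuntimeError(
                f"possible leak on device {device}: allocator memory was "
                f"{alloc_before} and is now {alloc_after} "
                f"(driver in use: {total - free_before} -> {total - free_after})"
            )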
2025-12-04T13:38:32.2997348Z [rank3]:E1204 13:36:48.591000 432963 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2997541Z [rank3]:E1204 13:36:48.591000 432963 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.2997883Z [rank3]:E1204 13:36:48.591000 432963 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda 2025-12-04T13:38:32.2998004Z [rank3]:E1204 13:36:48.591000 432963 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.2998214Z [rank3]:E1204 13:36:48.591000 432963 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.2998377Z [rank3]:E1204 13:36:48.591000 432963 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.2998414Z dist init r=3, world=4 2025-12-04T13:38:32.2998750Z [rank0]:[W1204 13:36:48.751891137 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.2998790Z FAILED [11.1208s] [ 33%] 2025-12-04T13:38:32.2998793Z 2025-12-04T13:38:32.2998849Z =================================== FAILURES =================================== 2025-12-04T13:38:32.2998944Z ________ TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda ________ 2025-12-04T13:38:32.2998989Z Traceback (most recent call last): 2025-12-04T13:38:32.2999161Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.2999204Z self._join_processes(fn) 2025-12-04T13:38:32.2999376Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.2999446Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.2999662Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.2999705Z raise RuntimeError(error) 2025-12-04T13:38:32.2999785Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.2999830Z Traceback (most recent call last): 2025-12-04T13:38:32.2999992Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3000034Z getattr(self, test_name)() 2025-12-04T13:38:32.3000196Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3000230Z fn() 2025-12-04T13:38:32.3000380Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3000421Z method(*args, **kwargs) 2025-12-04T13:38:32.3000573Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3000612Z method(*args, **kwargs) 2025-12-04T13:38:32.3000764Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3000813Z with policy(): 2025-12-04T13:38:32.3000967Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3001006Z raise RuntimeError(msg) 2025-12-04T13:38:32.3001349Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 0. CUDA driver allocated memory was 2453667840 and is now 4055891968. 2025-12-04T13:38:32.3001353Z 2025-12-04T13:38:32.3001426Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3001642Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda 2025-12-04T13:38:32.3001645Z 2025-12-04T13:38:32.3001732Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3001746Z 2025-12-04T13:38:32.3001806Z Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.3001849Z Traceback (most recent call last): 2025-12-04T13:38:32.3002012Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3002054Z getattr(self, test_name)() 2025-12-04T13:38:32.3002213Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3002248Z fn() 2025-12-04T13:38:32.3002398Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3002437Z method(*args, **kwargs) 2025-12-04T13:38:32.3002590Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3002630Z method(*args, **kwargs) 2025-12-04T13:38:32.3002782Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3002818Z with policy(): 2025-12-04T13:38:32.3002982Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3003023Z raise RuntimeError(msg) 2025-12-04T13:38:32.3003358Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 1. CUDA driver allocated memory was 2317352960 and is now 3919577088. 2025-12-04T13:38:32.3003373Z 2025-12-04T13:38:32.3003447Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3003661Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda 2025-12-04T13:38:32.3003665Z 2025-12-04T13:38:32.3003751Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3003753Z 2025-12-04T13:38:32.3003755Z 2025-12-04T13:38:32.3003829Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.3003915Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:38:32.3004147Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-5126715a40a2f0a6.xml - 2025-12-04T13:38:32.3004207Z =========================== short test summary info ============================ 2025-12-04T13:38:32.3004443Z FAILED [11.1208s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:38:32.3004496Z Traceback (most recent call last): 2025-12-04T13:38:32.3004665Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3004706Z getattr(self, test_name)() 2025-12-04T13:38:32.3004870Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3004903Z fn() 2025-12-04T13:38:32.3005055Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3005095Z method(*args, **kwargs) 2025-12-04T13:38:32.3005245Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3005283Z method(*args, **kwargs) 2025-12-04T13:38:32.3005432Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3005468Z with policy(): 2025-12-04T13:38:32.3005628Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3005669Z raise RuntimeError(msg) 2025-12-04T13:38:32.3006009Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 0. CUDA driver allocated memory was 2453667840 and is now 4055891968. 
2025-12-04T13:38:32.3006012Z 2025-12-04T13:38:32.3006085Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3006298Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda 2025-12-04T13:38:32.3006300Z 2025-12-04T13:38:32.3006385Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3006388Z 2025-12-04T13:38:32.3006445Z Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.3006490Z Traceback (most recent call last): 2025-12-04T13:38:32.3006651Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3006701Z getattr(self, test_name)() 2025-12-04T13:38:32.3006859Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3006904Z fn() 2025-12-04T13:38:32.3007054Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3007093Z method(*args, **kwargs) 2025-12-04T13:38:32.3007242Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3007282Z method(*args, **kwargs) 2025-12-04T13:38:32.3007433Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3007468Z with policy(): 2025-12-04T13:38:32.3007620Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3007659Z raise RuntimeError(msg) 2025-12-04T13:38:32.3007995Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 1. CUDA driver allocated memory was 2317352960 and is now 3919577088. 2025-12-04T13:38:32.3007998Z 2025-12-04T13:38:32.3008070Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3008283Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda 2025-12-04T13:38:32.3008298Z 2025-12-04T13:38:32.3008383Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3008446Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.3008508Z ====================== 1 failed, 30 deselected in 11.26s ======================= 2025-12-04T13:38:32.3008545Z Got exit code 1 2025-12-04T13:38:32.3008583Z Retrying single test... 
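"Retrying single test..." means the runner re-invokes pytest with only the failing test selected. To reproduce outside CI, the "To execute this test" line from the log can be run directly, or driven from a small helper like the sketch below; the environment variables and test id are copied from the log, the helper itself is hypothetical.

    import os
    import subprocess

    # Same flags the log prints in its repro hint. Setting
    # PYTORCH_PRINT_REPRO_ON_FAILURE=0 instead would suppress that hint.
    env = dict(
        os.environ,
        PYTORCH_TEST_WITH_ROCM="1",
        PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1",
    )

    # Run from the base repo dir, as the log says.
    subprocess.run(
        [
            "python",
            "test/distributed/fsdp/test_fsdp_core.py",
            "TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda",
        ],
        env=env,
        check=False,
    )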
2025-12-04T13:38:32.3008775Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-e5bd795f8999e2c0.xml 2025-12-04T13:38:32.3008833Z ============================= test session starts ============================== 2025-12-04T13:38:32.3008947Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.3008987Z cachedir: .pytest_cache 2025-12-04T13:38:32.3009149Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.3009222Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.3009266Z configfile: pytest.ini 2025-12-04T13:38:32.3009428Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.3009501Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.3009744Z stepcurrent: skipping 30 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_none_cuda 2025-12-04T13:38:32.3009787Z Running 1 items in this shard 2025-12-04T13:38:32.3009789Z 2025-12-04T13:38:32.3010085Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_none_cuda I1204 13:36:53.160000 433293 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 433362 2025-12-04T13:38:32.3010240Z I1204 13:36:53.160000 433293 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 433363 2025-12-04T13:38:32.3010392Z I1204 13:36:53.161000 433293 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 433364 2025-12-04T13:38:32.3010557Z I1204 13:36:53.161000 433293 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 433365 2025-12-04T13:38:32.3010915Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3010975Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3011327Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3011374Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3011726Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3011770Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3012119Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3012163Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3012743Z 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.3012792Z _warn_cpu_init() 2025-12-04T13:38:32.3013359Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.3013396Z _warn_cpu_init() 2025-12-04T13:38:32.3013985Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.3014022Z _warn_cpu_init() 2025-12-04T13:38:32.3014586Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.3014624Z _warn_cpu_init() 2025-12-04T13:38:32.3014914Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T13:38:32.3014956Z return func(*args, **kwargs) 2025-12-04T13:38:32.3015113Z [rank1]:E1204 13:37:02.389000 433363 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3015276Z [rank1]:E1204 13:37:02.389000 433363 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3015577Z [rank1]:E1204 13:37:02.389000 433363 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3015732Z [rank1]:E1204 13:37:02.389000 433363 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.3016022Z [rank1]:E1204 13:37:02.389000 433363 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3016147Z [rank1]:E1204 13:37:02.389000 433363 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3016428Z [rank1]:E1204 13:37:02.389000 433363 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3016574Z [rank1]:E1204 13:37:02.389000 433363 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3016852Z [rank1]:E1204 13:37:02.389000 433363 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3017007Z [rank1]:E1204 13:37:02.389000 433363 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3017284Z [rank1]:E1204 13:37:02.389000 433363 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3017421Z [rank1]:E1204 13:37:02.389000 433363 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.3017695Z [rank1]:E1204 13:37:02.389000 433363 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3017842Z [rank1]:E1204 13:37:02.389000 433363 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3018321Z [rank1]:E1204 13:37:02.389000 433363 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 1. CUDA driver allocated memory was 2317352960 and is now 3919577088. 
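The c10d_logger warning a few records above ("barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.") points at process-group setup. A hedged sketch of binding the group to a device, assuming a PyTorch version whose init_process_group accepts device_id and that MASTER_ADDR/MASTER_PORT are set in the environment; rank and world_size are placeholders.

    import torch
    import torch.distributed as dist

    def init(rank: int, world_size: int) -> None:
        torch.cuda.set_device(rank)
        # Binding the group to one device per rank is what the warning suggests
        # to silence the "using the device under current context" message.
        dist.init_process_group(
            backend="nccl",  # maps to RCCL on ROCm
            rank=rank,
            world_size=world_size,
            device_id=torch.device("cuda", rank),
        )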
2025-12-04T13:38:32.3018436Z [rank1]:E1204 13:37:02.389000 433363 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3018631Z [rank1]:E1204 13:37:02.389000 433363 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3018976Z [rank1]:E1204 13:37:02.389000 433363 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda 2025-12-04T13:38:32.3019090Z [rank1]:E1204 13:37:02.389000 433363 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3019309Z [rank1]:E1204 13:37:02.389000 433363 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3019472Z [rank1]:E1204 13:37:02.389000 433363 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.3019519Z dist init r=1, world=4 2025-12-04T13:38:32.3019695Z [rank3]:E1204 13:37:02.441000 433365 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3019853Z [rank3]:E1204 13:37:02.441000 433365 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3020143Z [rank3]:E1204 13:37:02.441000 433365 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3020295Z [rank3]:E1204 13:37:02.441000 433365 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.3020581Z [rank3]:E1204 13:37:02.441000 433365 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3020705Z [rank3]:E1204 13:37:02.441000 433365 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3020981Z [rank3]:E1204 13:37:02.441000 433365 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3021140Z [rank3]:E1204 13:37:02.441000 433365 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3021416Z [rank3]:E1204 13:37:02.441000 433365 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3021564Z [rank3]:E1204 13:37:02.441000 433365 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3021840Z [rank3]:E1204 13:37:02.441000 433365 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3021974Z [rank3]:E1204 13:37:02.441000 433365 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.3022263Z [rank3]:E1204 13:37:02.441000 433365 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3022408Z [rank3]:E1204 13:37:02.441000 433365 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3022874Z [rank3]:E1204 13:37:02.441000 433365 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 261632 on device 3. CUDA driver allocated memory was 2250244096 and is now 3852468224. 2025-12-04T13:38:32.3022987Z [rank3]:E1204 13:37:02.441000 433365 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3023186Z [rank3]:E1204 13:37:02.441000 433365 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3023527Z [rank3]:E1204 13:37:02.441000 433365 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda 2025-12-04T13:38:32.3023653Z [rank3]:E1204 13:37:02.441000 433365 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3023864Z [rank3]:E1204 13:37:02.441000 433365 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3024039Z [rank3]:E1204 13:37:02.441000 433365 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.3024077Z dist init r=3, world=4 2025-12-04T13:38:32.3024213Z [rank0]:E1204 13:37:02.455000 433362 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3024371Z [rank0]:E1204 13:37:02.455000 433362 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3024656Z [rank0]:E1204 13:37:02.455000 433362 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3024808Z [rank0]:E1204 13:37:02.455000 433362 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.3025092Z [rank0]:E1204 13:37:02.455000 433362 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3025215Z [rank0]:E1204 13:37:02.455000 433362 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3025506Z [rank0]:E1204 13:37:02.455000 433362 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3025652Z [rank0]:E1204 13:37:02.455000 433362 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T13:38:32.3025927Z [rank0]:E1204 13:37:02.455000 433362 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3026072Z [rank0]:E1204 13:37:02.455000 433362 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3026366Z [rank0]:E1204 13:37:02.455000 433362 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3026501Z [rank0]:E1204 13:37:02.455000 433362 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.3026782Z [rank0]:E1204 13:37:02.455000 433362 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3026927Z [rank0]:E1204 13:37:02.455000 433362 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3027394Z [rank0]:E1204 13:37:02.455000 433362 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 0. CUDA driver allocated memory was 2453667840 and is now 4055891968. 2025-12-04T13:38:32.3027510Z [rank0]:E1204 13:37:02.455000 433362 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3027706Z [rank0]:E1204 13:37:02.455000 433362 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3028063Z [rank0]:E1204 13:37:02.455000 433362 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda 2025-12-04T13:38:32.3028187Z [rank0]:E1204 13:37:02.455000 433362 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3028394Z [rank0]:E1204 13:37:02.455000 433362 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3028559Z [rank0]:E1204 13:37:02.455000 433362 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.3028595Z dist init r=0, world=4 2025-12-04T13:38:32.3028733Z [rank2]:E1204 13:37:02.459000 433364 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3028891Z [rank2]:E1204 13:37:02.459000 433364 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3029178Z [rank2]:E1204 13:37:02.459000 433364 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3029330Z [rank2]:E1204 13:37:02.459000 433364 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 
2025-12-04T13:38:32.3029650Z [rank2]:E1204 13:37:02.459000 433364 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3029771Z [rank2]:E1204 13:37:02.459000 433364 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3030052Z [rank2]:E1204 13:37:02.459000 433364 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3030199Z [rank2]:E1204 13:37:02.459000 433364 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3030473Z [rank2]:E1204 13:37:02.459000 433364 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3030642Z [rank2]:E1204 13:37:02.459000 433364 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3030917Z [rank2]:E1204 13:37:02.459000 433364 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3031052Z [rank2]:E1204 13:37:02.459000 433364 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.3031328Z [rank2]:E1204 13:37:02.459000 433364 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3031475Z [rank2]:E1204 13:37:02.459000 433364 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3031941Z [rank2]:E1204 13:37:02.459000 433364 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 2. CUDA driver allocated memory was 2300575744 and is now 3902799872. 
2025-12-04T13:38:32.3032065Z [rank2]:E1204 13:37:02.459000 433364 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3032259Z [rank2]:E1204 13:37:02.459000 433364 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3032616Z [rank2]:E1204 13:37:02.459000 433364 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda 2025-12-04T13:38:32.3032730Z [rank2]:E1204 13:37:02.459000 433364 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3032939Z [rank2]:E1204 13:37:02.459000 433364 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3033103Z [rank2]:E1204 13:37:02.459000 433364 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.3033141Z dist init r=2, world=4 2025-12-04T13:38:32.3033473Z [rank0]:[W1204 13:37:02.840032493 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.3033514Z FAILED [11.2221s] [100%] 2025-12-04T13:38:32.3033516Z 2025-12-04T13:38:32.3033570Z =================================== FAILURES =================================== 2025-12-04T13:38:32.3033682Z ________ TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda ________ 2025-12-04T13:38:32.3033726Z Traceback (most recent call last): 2025-12-04T13:38:32.3033891Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.3033932Z self._join_processes(fn) 2025-12-04T13:38:32.3034106Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.3034159Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.3034336Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.3034378Z raise RuntimeError(error) 2025-12-04T13:38:32.3034457Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.3034501Z Traceback (most recent call last): 2025-12-04T13:38:32.3034671Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3034713Z getattr(self, test_name)() 2025-12-04T13:38:32.3034871Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3034905Z fn() 2025-12-04T13:38:32.3035055Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3035096Z method(*args, **kwargs) 2025-12-04T13:38:32.3035246Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3035285Z method(*args, **kwargs) 2025-12-04T13:38:32.3035434Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3035472Z with policy(): 2025-12-04T13:38:32.3035625Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3035665Z raise RuntimeError(msg) 2025-12-04T13:38:32.3036019Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 1. CUDA driver allocated memory was 2317352960 and is now 3919577088. 2025-12-04T13:38:32.3036021Z 2025-12-04T13:38:32.3036108Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3036323Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda 2025-12-04T13:38:32.3036325Z 2025-12-04T13:38:32.3036411Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3036414Z 2025-12-04T13:38:32.3036416Z 2025-12-04T13:38:32.3036492Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.3036578Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.3036813Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-e5bd795f8999e2c0.xml - 2025-12-04T13:38:32.3036873Z =========================== short test summary info ============================ 2025-12-04T13:38:32.3037108Z FAILED [11.2221s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_none_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.3037153Z Traceback (most recent call last): 2025-12-04T13:38:32.3037317Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3037368Z getattr(self, test_name)() 2025-12-04T13:38:32.3037529Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3037562Z fn() 2025-12-04T13:38:32.3037714Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3037754Z method(*args, **kwargs) 2025-12-04T13:38:32.3037906Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3037945Z method(*args, **kwargs) 2025-12-04T13:38:32.3038094Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3038129Z with policy(): 2025-12-04T13:38:32.3038281Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3038321Z raise RuntimeError(msg) 2025-12-04T13:38:32.3038672Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 1. CUDA driver allocated memory was 2317352960 and is now 3919577088. 
2025-12-04T13:38:32.3038675Z 2025-12-04T13:38:32.3038749Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3038963Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda 2025-12-04T13:38:32.3038967Z 2025-12-04T13:38:32.3039052Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3039114Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.3039175Z ====================== 1 failed, 32 deselected in 11.38s ======================= 2025-12-04T13:38:32.3039211Z Got exit code 1 2025-12-04T13:38:32.3039251Z Retrying single test... 2025-12-04T13:38:32.3039440Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-761ce9b29c6b03b5.xml 2025-12-04T13:38:32.3039498Z ============================= test session starts ============================== 2025-12-04T13:38:32.3039650Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.3039691Z cachedir: .pytest_cache 2025-12-04T13:38:32.3039863Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.3039909Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.3039948Z configfile: pytest.ini 2025-12-04T13:38:32.3040110Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.3040184Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.3040398Z stepcurrent: skipping 30 already run items. 
Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_none_cuda 2025-12-04T13:38:32.3040440Z Running 1 items in this shard 2025-12-04T13:38:32.3040442Z 2025-12-04T13:38:32.3040735Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_none_cuda I1204 13:37:07.182000 433695 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 433764 2025-12-04T13:38:32.3040889Z I1204 13:37:07.183000 433695 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 433765 2025-12-04T13:38:32.3041039Z I1204 13:37:07.183000 433695 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 433766 2025-12-04T13:38:32.3041189Z I1204 13:37:07.184000 433695 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 433767 2025-12-04T13:38:32.3041560Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3041609Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3041960Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3042007Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3042358Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3042414Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3042767Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3042810Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3043389Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.3043426Z _warn_cpu_init() 2025-12-04T13:38:32.3044016Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
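Note: the enable_nested_tensor UserWarning repeated above fires because the encoder layer is built with the default batch_first=False. Purely as an illustration of the API the warning points at (illustrative sizes, not the test's actual model), a layer built batch-first keeps the nested-tensor fast path usable:

import torch.nn as nn

# The relevant bit is batch_first=True on the layer, which lets
# enable_nested_tensor=True take effect without the warning.
layer = nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=2, enable_nested_tensor=True)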
2025-12-04T13:38:32.3044054Z _warn_cpu_init() 2025-12-04T13:38:32.3044628Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.3044667Z _warn_cpu_init() 2025-12-04T13:38:32.3045236Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.3045273Z _warn_cpu_init() 2025-12-04T13:38:32.3045564Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.3045605Z return func(*args, **kwargs) 2025-12-04T13:38:32.3045765Z [rank3]:E1204 13:37:16.386000 433767 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3045927Z [rank3]:E1204 13:37:16.386000 433767 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3046219Z [rank3]:E1204 13:37:16.386000 433767 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3046374Z [rank3]:E1204 13:37:16.386000 433767 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.3046659Z [rank3]:E1204 13:37:16.386000 433767 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3046785Z [rank3]:E1204 13:37:16.386000 433767 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3047071Z [rank3]:E1204 13:37:16.386000 433767 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3047220Z [rank3]:E1204 13:37:16.386000 433767 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3047495Z [rank3]:E1204 13:37:16.386000 433767 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3047642Z [rank3]:E1204 13:37:16.386000 433767 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3047916Z [rank3]:E1204 13:37:16.386000 433767 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3048053Z [rank3]:E1204 13:37:16.386000 433767 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.3048344Z [rank3]:E1204 13:37:16.386000 433767 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3048500Z [rank3]:E1204 13:37:16.386000 433767 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3048972Z [rank3]:E1204 13:37:16.386000 433767 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 3. CUDA driver allocated memory was 2250244096 and is now 3852468224. 2025-12-04T13:38:32.3049086Z [rank3]:E1204 13:37:16.386000 433767 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3049284Z [rank3]:E1204 13:37:16.386000 433767 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3049655Z [rank3]:E1204 13:37:16.386000 433767 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda 2025-12-04T13:38:32.3049769Z [rank3]:E1204 13:37:16.386000 433767 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3049980Z [rank3]:E1204 13:37:16.386000 433767 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3050156Z [rank3]:E1204 13:37:16.386000 433767 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.3050194Z dist init r=3, world=4 2025-12-04T13:38:32.3050331Z [rank1]:E1204 13:37:16.399000 433765 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3050490Z [rank1]:E1204 13:37:16.399000 433765 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3050777Z [rank1]:E1204 13:37:16.399000 433765 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3050930Z [rank1]:E1204 13:37:16.399000 433765 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.3051226Z [rank1]:E1204 13:37:16.399000 433765 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3051349Z [rank1]:E1204 13:37:16.399000 433765 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3051625Z [rank1]:E1204 13:37:16.399000 433765 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3051772Z [rank1]:E1204 13:37:16.399000 433765 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3052047Z [rank1]:E1204 13:37:16.399000 433765 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3052194Z [rank1]:E1204 13:37:16.399000 433765 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3052493Z [rank1]:E1204 13:37:16.399000 433765 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3052628Z [rank1]:E1204 13:37:16.399000 433765 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.3052922Z [rank1]:E1204 13:37:16.399000 433765 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3053071Z [rank1]:E1204 13:37:16.399000 433765 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3053542Z [rank1]:E1204 13:37:16.399000 433765 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 1. CUDA driver allocated memory was 2317352960 and is now 3919577088. 
2025-12-04T13:38:32.3053656Z [rank1]:E1204 13:37:16.399000 433765 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3053848Z [rank1]:E1204 13:37:16.399000 433765 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3054191Z [rank1]:E1204 13:37:16.399000 433765 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda 2025-12-04T13:38:32.3054313Z [rank1]:E1204 13:37:16.399000 433765 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3054525Z [rank1]:E1204 13:37:16.399000 433765 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3054689Z [rank1]:E1204 13:37:16.399000 433765 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.3054726Z dist init r=1, world=4 2025-12-04T13:38:32.3054863Z [rank0]:E1204 13:37:16.407000 433764 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3055022Z [rank0]:E1204 13:37:16.407000 433764 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3055319Z [rank0]:E1204 13:37:16.407000 433764 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3055472Z [rank0]:E1204 13:37:16.407000 433764 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.3055757Z [rank0]:E1204 13:37:16.407000 433764 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3055878Z [rank0]:E1204 13:37:16.407000 433764 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3056153Z [rank0]:E1204 13:37:16.407000 433764 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3056299Z [rank0]:E1204 13:37:16.407000 433764 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3056576Z [rank0]:E1204 13:37:16.407000 433764 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3056731Z [rank0]:E1204 13:37:16.407000 433764 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3057005Z [rank0]:E1204 13:37:16.407000 433764 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3057149Z [rank0]:E1204 13:37:16.407000 433764 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.3057424Z [rank0]:E1204 13:37:16.407000 433764 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3057572Z [rank0]:E1204 13:37:16.407000 433764 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3058037Z [rank0]:E1204 13:37:16.407000 433764 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 0. CUDA driver allocated memory was 2453667840 and is now 4055891968. 2025-12-04T13:38:32.3058151Z [rank0]:E1204 13:37:16.407000 433764 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3058346Z [rank0]:E1204 13:37:16.407000 433764 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3058698Z [rank0]:E1204 13:37:16.407000 433764 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda 2025-12-04T13:38:32.3058810Z [rank0]:E1204 13:37:16.407000 433764 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3059021Z [rank0]:E1204 13:37:16.407000 433764 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3059185Z [rank0]:E1204 13:37:16.407000 433764 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.3059222Z dist init r=0, world=4 2025-12-04T13:38:32.3059358Z [rank2]:E1204 13:37:16.443000 433766 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3059528Z [rank2]:E1204 13:37:16.443000 433766 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3059851Z [rank2]:E1204 13:37:16.443000 433766 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3060004Z [rank2]:E1204 13:37:16.443000 433766 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.3060287Z [rank2]:E1204 13:37:16.443000 433766 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3060409Z [rank2]:E1204 13:37:16.443000 433766 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3060687Z [rank2]:E1204 13:37:16.443000 433766 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3060833Z [rank2]:E1204 13:37:16.443000 433766 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T13:38:32.3061121Z [rank2]:E1204 13:37:16.443000 433766 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3061285Z [rank2]:E1204 13:37:16.443000 433766 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3061559Z [rank2]:E1204 13:37:16.443000 433766 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3061696Z [rank2]:E1204 13:37:16.443000 433766 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.3061976Z [rank2]:E1204 13:37:16.443000 433766 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3062122Z [rank2]:E1204 13:37:16.443000 433766 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3062588Z [rank2]:E1204 13:37:16.443000 433766 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 2. CUDA driver allocated memory was 2300575744 and is now 3902799872. 2025-12-04T13:38:32.3062714Z [rank2]:E1204 13:37:16.443000 433766 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3062907Z [rank2]:E1204 13:37:16.443000 433766 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3063249Z [rank2]:E1204 13:37:16.443000 433766 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda 2025-12-04T13:38:32.3063361Z [rank2]:E1204 13:37:16.443000 433766 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3063572Z [rank2]:E1204 13:37:16.443000 433766 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3063735Z [rank2]:E1204 13:37:16.443000 433766 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.3063786Z dist init r=2, world=4 2025-12-04T13:38:32.3064124Z [rank0]:[W1204 13:37:16.662245851 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.3064162Z FAILED [11.1222s] [100%] 2025-12-04T13:38:32.3064164Z 2025-12-04T13:38:32.3064220Z =================================== FAILURES =================================== 2025-12-04T13:38:32.3064315Z ________ TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda ________ 2025-12-04T13:38:32.3064360Z Traceback (most recent call last): 2025-12-04T13:38:32.3064522Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.3064566Z self._join_processes(fn) 2025-12-04T13:38:32.3064739Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.3064792Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.3064979Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.3065022Z raise RuntimeError(error) 2025-12-04T13:38:32.3065099Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.3065156Z Traceback (most recent call last): 2025-12-04T13:38:32.3065315Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3065357Z getattr(self, test_name)() 2025-12-04T13:38:32.3065513Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3065548Z fn() 2025-12-04T13:38:32.3065699Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3065739Z method(*args, **kwargs) 2025-12-04T13:38:32.3065889Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3065929Z method(*args, **kwargs) 2025-12-04T13:38:32.3066080Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3066117Z with policy(): 2025-12-04T13:38:32.3066271Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3066310Z raise RuntimeError(msg) 2025-12-04T13:38:32.3066656Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 3. CUDA driver allocated memory was 2250244096 and is now 3852468224. 2025-12-04T13:38:32.3066668Z 2025-12-04T13:38:32.3066742Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3066957Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda 2025-12-04T13:38:32.3066959Z 2025-12-04T13:38:32.3067044Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3067047Z 2025-12-04T13:38:32.3067049Z 2025-12-04T13:38:32.3067122Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.3067208Z Process 3 terminated with exit code 10, terminating remaining processes. 
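Note: the ProcessGroupNCCL warning above ("destroy_process_group() was not called before program exit") asks for an explicit teardown of the default group. A minimal sketch of the usual per-rank worker shape, assuming the launcher provides MASTER_ADDR/MASTER_PORT; not the test harness code:

import torch.distributed as dist

def run_worker(rank, world_size):
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    try:
        ...  # test or training body
    finally:
        # Explicit shutdown releases NCCL communicators deterministically and
        # avoids the warning seen in this log.
        dist.destroy_process_group()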
2025-12-04T13:38:32.3067443Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-761ce9b29c6b03b5.xml - 2025-12-04T13:38:32.3067514Z =========================== short test summary info ============================ 2025-12-04T13:38:32.3067748Z FAILED [11.1222s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_none_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.3067794Z Traceback (most recent call last): 2025-12-04T13:38:32.3067957Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3068000Z getattr(self, test_name)() 2025-12-04T13:38:32.3068160Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3068194Z fn() 2025-12-04T13:38:32.3068345Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3068385Z method(*args, **kwargs) 2025-12-04T13:38:32.3068537Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3068576Z method(*args, **kwargs) 2025-12-04T13:38:32.3068727Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3068775Z with policy(): 2025-12-04T13:38:32.3068928Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3068980Z raise RuntimeError(msg) 2025-12-04T13:38:32.3069325Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 3. CUDA driver allocated memory was 2250244096 and is now 3852468224. 2025-12-04T13:38:32.3069330Z 2025-12-04T13:38:32.3069402Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3069654Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_none_cuda 2025-12-04T13:38:32.3069656Z 2025-12-04T13:38:32.3069741Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3069804Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:38:32.3069865Z ====================== 1 failed, 32 deselected in 11.27s ======================= 2025-12-04T13:38:32.3069902Z Got exit code 1 2025-12-04T13:38:32.3070067Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_none_cuda 2025-12-04T13:38:32.3070195Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:32.3070395Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-96f31bdbacda79ff.xml 2025-12-04T13:38:32.3070452Z ============================= test session starts ============================== 2025-12-04T13:38:32.3070563Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.3070603Z cachedir: .pytest_cache 2025-12-04T13:38:32.3070760Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.3070806Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.3070845Z configfile: pytest.ini 2025-12-04T13:38:32.3071008Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.3071079Z collecting ... collected 60 items / 31 deselected / 29 selected 2025-12-04T13:38:32.3071132Z stepcurrent: skipping 31 already run items. 2025-12-04T13:38:32.3071174Z Running 2 items in this shard 2025-12-04T13:38:32.3071177Z 2025-12-04T13:38:32.3071498Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_shard_grad_op_cuda I1204 13:37:21.064000 434097 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 434166 2025-12-04T13:38:32.3071655Z I1204 13:37:21.065000 434097 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 434167 2025-12-04T13:38:32.3071806Z I1204 13:37:21.065000 434097 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 434168 2025-12-04T13:38:32.3071957Z I1204 13:37:21.066000 434097 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 434169 2025-12-04T13:38:32.3072316Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3072365Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3072729Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3072775Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3073125Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3073181Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3073533Z 
/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3073576Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3074161Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.3074199Z _warn_cpu_init() 2025-12-04T13:38:32.3074782Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.3074834Z _warn_cpu_init() 2025-12-04T13:38:32.3075401Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.3075438Z _warn_cpu_init() 2025-12-04T13:38:32.3075738Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.3075782Z return func(*args, **kwargs) 2025-12-04T13:38:32.3076353Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.3076391Z _warn_cpu_init() 2025-12-04T13:38:32.3076533Z [rank3]:E1204 13:37:30.653000 434169 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3076693Z [rank3]:E1204 13:37:30.653000 434169 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3076984Z [rank3]:E1204 13:37:30.653000 434169 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3077153Z [rank3]:E1204 13:37:30.653000 434169 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.3077437Z [rank3]:E1204 13:37:30.653000 434169 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3077571Z [rank3]:E1204 13:37:30.653000 434169 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3077847Z [rank3]:E1204 13:37:30.653000 434169 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3077997Z [rank3]:E1204 13:37:30.653000 434169 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3078274Z [rank3]:E1204 13:37:30.653000 434169 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3078422Z [rank3]:E1204 13:37:30.653000 434169 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3078696Z [rank3]:E1204 13:37:30.653000 434169 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3078832Z [rank3]:E1204 13:37:30.653000 434169 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.3079119Z [rank3]:E1204 13:37:30.653000 434169 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3079267Z [rank3]:E1204 13:37:30.653000 434169 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3079771Z [rank3]:E1204 13:37:30.653000 434169 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 3829399552. 
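Note: the repeated _warn_cpu_init() messages recommend handing FSDP a device_id so a CPU-constructed module is moved to the GPU before sharding initialization. A hedged sketch of that call (nn.Linear stands in for the test's transformer; assumes torch.distributed is already initialized in this process):

import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

module = nn.Linear(16, 16)  # stand-in for the CPU-constructed model
fsdp_module = FSDP(module, device_id=torch.cuda.current_device())
# With device_id given, FSDP moves the module to that GPU before sharding,
# which also satisfies the sync_module_states=True requirement mentioned above.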
2025-12-04T13:38:32.3079887Z [rank3]:E1204 13:37:30.653000 434169 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3080098Z [rank3]:E1204 13:37:30.653000 434169 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3080459Z [rank3]:E1204 13:37:30.653000 434169 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.3080572Z [rank3]:E1204 13:37:30.653000 434169 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3080783Z [rank3]:E1204 13:37:30.653000 434169 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3080946Z [rank3]:E1204 13:37:30.653000 434169 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.3080985Z dist init r=3, world=4 2025-12-04T13:38:32.3081123Z [rank2]:E1204 13:37:30.654000 434168 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3081283Z [rank2]:E1204 13:37:30.654000 434168 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3081583Z [rank2]:E1204 13:37:30.654000 434168 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3081747Z [rank2]:E1204 13:37:30.654000 434168 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.3082030Z [rank2]:E1204 13:37:30.654000 434168 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3082155Z [rank2]:E1204 13:37:30.654000 434168 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3082430Z [rank2]:E1204 13:37:30.654000 434168 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3082577Z [rank2]:E1204 13:37:30.654000 434168 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3082856Z [rank2]:E1204 13:37:30.654000 434168 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3083003Z [rank2]:E1204 13:37:30.654000 434168 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3083292Z [rank2]:E1204 13:37:30.654000 434168 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3083427Z [rank2]:E1204 13:37:30.654000 434168 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.3083704Z [rank2]:E1204 13:37:30.654000 434168 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3083851Z [rank2]:E1204 13:37:30.654000 434168 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3084333Z [rank2]:E1204 13:37:30.654000 434168 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 2. CUDA driver allocated memory was 2300575744 and is now 3879731200. 2025-12-04T13:38:32.3084449Z [rank2]:E1204 13:37:30.654000 434168 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3084642Z [rank2]:E1204 13:37:30.654000 434168 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3084998Z [rank2]:E1204 13:37:30.654000 434168 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.3085111Z [rank2]:E1204 13:37:30.654000 434168 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3085323Z [rank2]:E1204 13:37:30.654000 434168 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3085486Z [rank2]:E1204 13:37:30.654000 434168 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.3085523Z dist init r=2, world=4 2025-12-04T13:38:32.3085670Z [rank0]:E1204 13:37:30.701000 434166 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3085828Z [rank0]:E1204 13:37:30.701000 434166 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3086122Z [rank0]:E1204 13:37:30.701000 434166 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3086274Z [rank0]:E1204 13:37:30.701000 434166 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.3086559Z [rank0]:E1204 13:37:30.701000 434166 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3086681Z [rank0]:E1204 13:37:30.701000 434166 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3086959Z [rank0]:E1204 13:37:30.701000 434166 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3087105Z [rank0]:E1204 13:37:30.701000 434166 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:38:32.3087383Z [rank0]:E1204 13:37:30.701000 434166 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3087540Z [rank0]:E1204 13:37:30.701000 434166 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3087815Z [rank0]:E1204 13:37:30.701000 434166 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3087951Z [rank0]:E1204 13:37:30.701000 434166 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.3088226Z [rank0]:E1204 13:37:30.701000 434166 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3088373Z [rank0]:E1204 13:37:30.701000 434166 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3088858Z [rank0]:E1204 13:37:30.701000 434166 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 4032823296. 2025-12-04T13:38:32.3088971Z [rank0]:E1204 13:37:30.701000 434166 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3089166Z [rank0]:E1204 13:37:30.701000 434166 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3089517Z [rank0]:E1204 13:37:30.701000 434166 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.3089665Z [rank0]:E1204 13:37:30.701000 434166 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3089888Z [rank0]:E1204 13:37:30.701000 434166 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3090052Z [rank0]:E1204 13:37:30.701000 434166 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.3090102Z dist init r=0, world=4 2025-12-04T13:38:32.3090238Z [rank1]:E1204 13:37:30.702000 434167 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3090396Z [rank1]:E1204 13:37:30.702000 434167 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3090682Z [rank1]:E1204 13:37:30.702000 434167 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3090833Z [rank1]:E1204 13:37:30.702000 434167 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:38:32.3091116Z [rank1]:E1204 13:37:30.702000 434167 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3091240Z [rank1]:E1204 13:37:30.702000 434167 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3091514Z [rank1]:E1204 13:37:30.702000 434167 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3091676Z [rank1]:E1204 13:37:30.702000 434167 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3091951Z [rank1]:E1204 13:37:30.702000 434167 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3092099Z [rank1]:E1204 13:37:30.702000 434167 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3092375Z [rank1]:E1204 13:37:30.702000 434167 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3092511Z [rank1]:E1204 13:37:30.702000 434167 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.3092807Z [rank1]:E1204 13:37:30.702000 434167 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3092953Z [rank1]:E1204 13:37:30.702000 434167 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3093429Z [rank1]:E1204 13:37:30.702000 434167 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 3896508416. 
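Note: the earlier c10d_logger warning ("barrier(): using the device under current context") suggests binding the process group to a device at init time. A hedged sketch of that pattern, assuming one GPU per rank and env:// rendezvous variables set by the launcher; function name and arguments are illustrative:

import torch
import torch.distributed as dist

def init_worker(rank, world_size):
    # Binding the group to a concrete device lets c10d route collectives such
    # as barrier() without guessing from the current device context.
    torch.cuda.set_device(rank)
    dist.init_process_group(
        "nccl",
        rank=rank,
        world_size=world_size,
        device_id=torch.device("cuda", rank),
    )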
2025-12-04T13:38:32.3093543Z [rank1]:E1204 13:37:30.702000 434167 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3093736Z [rank1]:E1204 13:37:30.702000 434167 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3094093Z [rank1]:E1204 13:37:30.702000 434167 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.3094215Z [rank1]:E1204 13:37:30.702000 434167 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3094425Z [rank1]:E1204 13:37:30.702000 434167 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3094599Z [rank1]:E1204 13:37:30.702000 434167 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.3094637Z dist init r=1, world=4 2025-12-04T13:38:32.3094973Z [rank0]:[W1204 13:37:31.081157773 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.3095013Z FAILED [11.5241s] [ 50%] 2025-12-04T13:38:32.3095015Z 2025-12-04T13:38:32.3095072Z =================================== FAILURES =================================== 2025-12-04T13:38:32.3095170Z ____ TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda ____ 2025-12-04T13:38:32.3095216Z Traceback (most recent call last): 2025-12-04T13:38:32.3095380Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.3095423Z self._join_processes(fn) 2025-12-04T13:38:32.3095594Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.3095658Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.3095835Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.3095878Z raise RuntimeError(error) 2025-12-04T13:38:32.3095956Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.3096001Z Traceback (most recent call last): 2025-12-04T13:38:32.3096161Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3096203Z getattr(self, test_name)() 2025-12-04T13:38:32.3096362Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3096396Z fn() 2025-12-04T13:38:32.3096548Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3096589Z method(*args, **kwargs) 2025-12-04T13:38:32.3096749Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3096790Z method(*args, **kwargs) 2025-12-04T13:38:32.3096940Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3096977Z with policy(): 2025-12-04T13:38:32.3097129Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3097170Z raise RuntimeError(msg) 2025-12-04T13:38:32.3097520Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 3829399552. 2025-12-04T13:38:32.3097523Z 2025-12-04T13:38:32.3097597Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3097826Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.3097828Z 2025-12-04T13:38:32.3097929Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3097931Z 2025-12-04T13:38:32.3097933Z 2025-12-04T13:38:32.3098008Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.3098105Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.3098338Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-96f31bdbacda79ff.xml - 2025-12-04T13:38:32.3098397Z =========================== short test summary info ============================ 2025-12-04T13:38:32.3098644Z FAILED [11.5241s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_shard_grad_op_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.3098690Z Traceback (most recent call last): 2025-12-04T13:38:32.3098855Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3098898Z getattr(self, test_name)() 2025-12-04T13:38:32.3099058Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3099093Z fn() 2025-12-04T13:38:32.3099243Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3099283Z method(*args, **kwargs) 2025-12-04T13:38:32.3099433Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3099482Z method(*args, **kwargs) 2025-12-04T13:38:32.3099666Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3099703Z with policy(): 2025-12-04T13:38:32.3099854Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3099894Z raise RuntimeError(msg) 2025-12-04T13:38:32.3100245Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 3829399552. 
2025-12-04T13:38:32.3100248Z 2025-12-04T13:38:32.3100323Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3100563Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.3100567Z 2025-12-04T13:38:32.3100654Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3100716Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.3100777Z ====================== 1 failed, 31 deselected in 11.69s ======================= 2025-12-04T13:38:32.3100814Z Got exit code 1 2025-12-04T13:38:32.3100853Z Retrying single test... 2025-12-04T13:38:32.3101046Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4c42c4f7d061028c.xml 2025-12-04T13:38:32.3101102Z ============================= test session starts ============================== 2025-12-04T13:38:32.3101214Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.3101255Z cachedir: .pytest_cache 2025-12-04T13:38:32.3101413Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.3101458Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.3101497Z configfile: pytest.ini 2025-12-04T13:38:32.3101674Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.3101748Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.3101967Z stepcurrent: skipping 31 already run items. 
Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.3102024Z Running 1 items in this shard 2025-12-04T13:38:32.3102026Z 2025-12-04T13:38:32.3102331Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_shard_grad_op_cuda I1204 13:37:35.304000 434499 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 434568 2025-12-04T13:38:32.3102488Z I1204 13:37:35.305000 434499 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 434569 2025-12-04T13:38:32.3102639Z I1204 13:37:35.305000 434499 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 434570 2025-12-04T13:38:32.3102790Z I1204 13:37:35.306000 434499 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 434571 2025-12-04T13:38:32.3103155Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3103202Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3103557Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3103616Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3103971Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3104016Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3104365Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3104409Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3104996Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.3105034Z _warn_cpu_init() 2025-12-04T13:38:32.3105609Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.3105648Z _warn_cpu_init() 2025-12-04T13:38:32.3106227Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.3108672Z _warn_cpu_init() 2025-12-04T13:38:32.3108969Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.3109013Z return func(*args, **kwargs) 2025-12-04T13:38:32.3109620Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.3109659Z _warn_cpu_init() 2025-12-04T13:38:32.3109803Z [rank3]:E1204 13:37:45.090000 434571 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3109967Z [rank3]:E1204 13:37:45.090000 434571 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3110264Z [rank3]:E1204 13:37:45.090000 434571 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3110452Z [rank3]:E1204 13:37:45.090000 434571 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.3110737Z [rank3]:E1204 13:37:45.090000 434571 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3110861Z [rank3]:E1204 13:37:45.090000 434571 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3111140Z [rank3]:E1204 13:37:45.090000 434571 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3111289Z [rank3]:E1204 13:37:45.090000 434571 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3111585Z [rank3]:E1204 13:37:45.090000 434571 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3111732Z [rank3]:E1204 13:37:45.090000 434571 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3112006Z [rank3]:E1204 13:37:45.090000 434571 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3112143Z [rank3]:E1204 13:37:45.090000 434571 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.3112420Z [rank3]:E1204 13:37:45.090000 434571 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3112570Z [rank3]:E1204 13:37:45.090000 434571 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3113067Z [rank3]:E1204 13:37:45.090000 434571 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 3829399552. 2025-12-04T13:38:32.3113197Z [rank3]:E1204 13:37:45.090000 434571 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3113391Z [rank3]:E1204 13:37:45.090000 434571 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3113746Z [rank3]:E1204 13:37:45.090000 434571 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.3113861Z [rank3]:E1204 13:37:45.090000 434571 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3114073Z [rank3]:E1204 13:37:45.090000 434571 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3114238Z [rank3]:E1204 13:37:45.090000 434571 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.3114276Z dist init r=3, world=4 2025-12-04T13:38:32.3114413Z [rank2]:E1204 13:37:45.120000 434570 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3114572Z [rank2]:E1204 13:37:45.120000 434570 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3114872Z [rank2]:E1204 13:37:45.120000 434570 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3115027Z [rank2]:E1204 13:37:45.120000 434570 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.3115309Z [rank2]:E1204 13:37:45.120000 434570 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3115435Z [rank2]:E1204 13:37:45.120000 434570 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3115720Z [rank2]:E1204 13:37:45.120000 434570 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3115868Z [rank2]:E1204 13:37:45.120000 434570 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3116142Z [rank2]:E1204 13:37:45.120000 434570 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3116289Z [rank2]:E1204 13:37:45.120000 434570 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3116566Z [rank2]:E1204 13:37:45.120000 434570 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3116701Z [rank2]:E1204 13:37:45.120000 434570 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.3116984Z [rank2]:E1204 13:37:45.120000 434570 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3117141Z [rank2]:E1204 13:37:45.120000 434570 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3117616Z [rank2]:E1204 13:37:45.120000 434570 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 3879731200. 
2025-12-04T13:38:32.3117739Z [rank2]:E1204 13:37:45.120000 434570 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3117935Z [rank2]:E1204 13:37:45.120000 434570 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3118292Z [rank2]:E1204 13:37:45.120000 434570 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.3118403Z [rank2]:E1204 13:37:45.120000 434570 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3118615Z [rank2]:E1204 13:37:45.120000 434570 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3118777Z [rank2]:E1204 13:37:45.120000 434570 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.3118824Z dist init r=2, world=4 2025-12-04T13:38:32.3118960Z [rank0]:E1204 13:37:45.129000 434568 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3119119Z [rank0]:E1204 13:37:45.129000 434568 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3119405Z [rank0]:E1204 13:37:45.129000 434568 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3119559Z [rank0]:E1204 13:37:45.129000 434568 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.3119875Z [rank0]:E1204 13:37:45.129000 434568 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3120014Z [rank0]:E1204 13:37:45.129000 434568 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3120292Z [rank0]:E1204 13:37:45.129000 434568 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3120438Z [rank0]:E1204 13:37:45.129000 434568 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3120717Z [rank0]:E1204 13:37:45.129000 434568 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3120861Z [rank0]:E1204 13:37:45.129000 434568 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3121139Z [rank0]:E1204 13:37:45.129000 434568 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3121273Z [rank0]:E1204 13:37:45.129000 434568 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.3121564Z [rank0]:E1204 13:37:45.129000 434568 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3121727Z [rank0]:E1204 13:37:45.129000 434568 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3122204Z [rank0]:E1204 13:37:45.129000 434568 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 0. CUDA driver allocated memory was 2453667840 and is now 4032823296. 2025-12-04T13:38:32.3122318Z [rank0]:E1204 13:37:45.129000 434568 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3122512Z [rank0]:E1204 13:37:45.129000 434568 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3122867Z [rank0]:E1204 13:37:45.129000 434568 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.3122980Z [rank0]:E1204 13:37:45.129000 434568 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3123202Z [rank0]:E1204 13:37:45.129000 434568 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3123365Z [rank0]:E1204 13:37:45.129000 434568 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.3123403Z dist init r=0, world=4 2025-12-04T13:38:32.3123540Z [rank1]:E1204 13:37:45.134000 434569 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3123697Z [rank1]:E1204 13:37:45.134000 434569 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3123986Z [rank1]:E1204 13:37:45.134000 434569 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3124139Z [rank1]:E1204 13:37:45.134000 434569 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.3124432Z [rank1]:E1204 13:37:45.134000 434569 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3124557Z [rank1]:E1204 13:37:45.134000 434569 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3124829Z [rank1]:E1204 13:37:45.134000 434569 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3124976Z [rank1]:E1204 13:37:45.134000 434569 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:38:32.3125250Z [rank1]:E1204 13:37:45.134000 434569 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3125398Z [rank1]:E1204 13:37:45.134000 434569 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3125682Z [rank1]:E1204 13:37:45.134000 434569 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3125819Z [rank1]:E1204 13:37:45.134000 434569 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.3126111Z [rank1]:E1204 13:37:45.134000 434569 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3126258Z [rank1]:E1204 13:37:45.134000 434569 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3126735Z [rank1]:E1204 13:37:45.134000 434569 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 3896508416. 2025-12-04T13:38:32.3126848Z [rank1]:E1204 13:37:45.134000 434569 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3127043Z [rank1]:E1204 13:37:45.134000 434569 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3127396Z [rank1]:E1204 13:37:45.134000 434569 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.3127517Z [rank1]:E1204 13:37:45.134000 434569 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3127728Z [rank1]:E1204 13:37:45.134000 434569 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3127891Z [rank1]:E1204 13:37:45.134000 434569 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.3127930Z dist init r=1, world=4 2025-12-04T13:38:32.3128265Z [rank0]:[W1204 13:37:45.413622202 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.3128306Z FAILED [11.8208s] [100%] 2025-12-04T13:38:32.3128308Z 2025-12-04T13:38:32.3128373Z =================================== FAILURES =================================== 2025-12-04T13:38:32.3128473Z ____ TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda ____ 2025-12-04T13:38:32.3128518Z Traceback (most recent call last): 2025-12-04T13:38:32.3128682Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.3128724Z self._join_processes(fn) 2025-12-04T13:38:32.3128897Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.3128951Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.3129129Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.3129174Z raise RuntimeError(error) 2025-12-04T13:38:32.3129251Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.3129297Z Traceback (most recent call last): 2025-12-04T13:38:32.3129458Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3129499Z getattr(self, test_name)() 2025-12-04T13:38:32.3129709Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3129744Z fn() 2025-12-04T13:38:32.3129910Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3129951Z method(*args, **kwargs) 2025-12-04T13:38:32.3130101Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3130140Z method(*args, **kwargs) 2025-12-04T13:38:32.3130293Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3130329Z with policy(): 2025-12-04T13:38:32.3130570Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3130612Z raise RuntimeError(msg) 2025-12-04T13:38:32.3130960Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 3829399552. 
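Two of the warnings above concern process-group lifecycle: the barrier() warning suggests passing `device_id` to `init_process_group`, and the NCCL shutdown warning notes that `destroy_process_group()` was never called before exit. A rough sketch of that lifecycle is below; it assumes the usual RANK/MASTER_ADDR/MASTER_PORT/WORLD_SIZE environment from a launcher and is not the test harness's own setup code.

    import os
    import torch
    import torch.distributed as dist

    rank = int(os.environ["RANK"])             # assumed to be set by the launcher
    dist.init_process_group(
        backend="nccl",                        # maps to RCCL on ROCm builds
        device_id=torch.device("cuda", rank),  # silences the barrier() device warning
    )
    try:
        dist.barrier()
        # ... per-rank work would go here ...
    finally:
        dist.destroy_process_group()           # avoids the resource-leak warning at exit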
2025-12-04T13:38:32.3130963Z 2025-12-04T13:38:32.3131039Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3131267Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.3131283Z 2025-12-04T13:38:32.3131371Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3131373Z 2025-12-04T13:38:32.3131375Z 2025-12-04T13:38:32.3131451Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.3131537Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.3131768Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4c42c4f7d061028c.xml - 2025-12-04T13:38:32.3131828Z =========================== short test summary info ============================ 2025-12-04T13:38:32.3132072Z FAILED [11.8208s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_shard_grad_op_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.3132116Z Traceback (most recent call last): 2025-12-04T13:38:32.3132295Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3132337Z getattr(self, test_name)() 2025-12-04T13:38:32.3132497Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3132530Z fn() 2025-12-04T13:38:32.3132681Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3132720Z method(*args, **kwargs) 2025-12-04T13:38:32.3132872Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3132911Z method(*args, **kwargs) 2025-12-04T13:38:32.3133060Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3133097Z with policy(): 2025-12-04T13:38:32.3133249Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3133290Z raise RuntimeError(msg) 2025-12-04T13:38:32.3133651Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 3829399552. 2025-12-04T13:38:32.3133654Z 2025-12-04T13:38:32.3133739Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3133968Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.3133971Z 2025-12-04T13:38:32.3134056Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3134118Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:38:32.3134182Z ====================== 1 failed, 32 deselected in 11.98s ======================= 2025-12-04T13:38:32.3134218Z Got exit code 1 2025-12-04T13:38:32.3134257Z Retrying single test... 2025-12-04T13:38:32.3134446Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-0af34500f77d76fa.xml 2025-12-04T13:38:32.3134505Z ============================= test session starts ============================== 2025-12-04T13:38:32.3134621Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.3134661Z cachedir: .pytest_cache 2025-12-04T13:38:32.3134820Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.3134865Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.3134914Z configfile: pytest.ini 2025-12-04T13:38:32.3135082Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.3135157Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.3135377Z stepcurrent: skipping 31 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.3135419Z Running 1 items in this shard 2025-12-04T13:38:32.3135422Z 2025-12-04T13:38:32.3135725Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_shard_grad_op_cuda I1204 13:37:49.642000 434901 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 434970 2025-12-04T13:38:32.3135882Z I1204 13:37:49.643000 434901 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 434971 2025-12-04T13:38:32.3136045Z I1204 13:37:49.643000 434901 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 434972 2025-12-04T13:38:32.3136196Z I1204 13:37:49.644000 434901 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 434973 2025-12-04T13:38:32.3136558Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3136606Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3136958Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3137003Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3137356Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3137399Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3137760Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because 
encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3137815Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3138390Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.3138428Z _warn_cpu_init() 2025-12-04T13:38:32.3138719Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:38:32.3138761Z return func(*args, **kwargs) 2025-12-04T13:38:32.3139336Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.3139383Z _warn_cpu_init() 2025-12-04T13:38:32.3139992Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:38:32.3140028Z _warn_cpu_init() 2025-12-04T13:38:32.3140605Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:38:32.3140641Z _warn_cpu_init() 2025-12-04T13:38:32.3140784Z [rank1]:E1204 13:37:59.206000 434971 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3140944Z [rank1]:E1204 13:37:59.206000 434971 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3141233Z [rank1]:E1204 13:37:59.206000 434971 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3141388Z [rank1]:E1204 13:37:59.206000 434971 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.3141672Z [rank1]:E1204 13:37:59.206000 434971 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3141797Z [rank1]:E1204 13:37:59.206000 434971 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3142090Z [rank1]:E1204 13:37:59.206000 434971 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3142240Z [rank1]:E1204 13:37:59.206000 434971 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3142528Z [rank1]:E1204 13:37:59.206000 434971 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3142676Z [rank1]:E1204 13:37:59.206000 434971 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3142950Z [rank1]:E1204 13:37:59.206000 434971 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3143086Z [rank1]:E1204 13:37:59.206000 434971 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.3143363Z [rank1]:E1204 13:37:59.206000 434971 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3143510Z [rank1]:E1204 13:37:59.206000 434971 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3143987Z [rank1]:E1204 13:37:59.206000 434971 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 1. CUDA driver allocated memory was 2317352960 and is now 3896508416. 
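The UserWarning from torch/nn/modules/transformer.py repeated above notes that `enable_nested_tensor=True` has no effect unless the encoder layer is built with `batch_first=True`. A small sketch with hypothetical sizes:

    import torch.nn as nn

    # batch_first=True lets TransformerEncoder use the nested-tensor fast path
    # that the warning says is currently disabled.
    encoder_layer = nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)
    encoder = nn.TransformerEncoder(encoder_layer, num_layers=6, enable_nested_tensor=True)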
2025-12-04T13:38:32.3144116Z [rank1]:E1204 13:37:59.206000 434971 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3144313Z [rank1]:E1204 13:37:59.206000 434971 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3144673Z [rank1]:E1204 13:37:59.206000 434971 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.3144787Z [rank1]:E1204 13:37:59.206000 434971 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3145010Z [rank1]:E1204 13:37:59.206000 434971 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3145174Z [rank1]:E1204 13:37:59.206000 434971 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.3145213Z dist init r=1, world=4 2025-12-04T13:38:32.3145351Z [rank2]:E1204 13:37:59.208000 434972 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3145510Z [rank2]:E1204 13:37:59.208000 434972 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3145794Z [rank2]:E1204 13:37:59.208000 434972 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3145949Z [rank2]:E1204 13:37:59.208000 434972 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.3146244Z [rank2]:E1204 13:37:59.208000 434972 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3146367Z [rank2]:E1204 13:37:59.208000 434972 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3146652Z [rank2]:E1204 13:37:59.208000 434972 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3146798Z [rank2]:E1204 13:37:59.208000 434972 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3147075Z [rank2]:E1204 13:37:59.208000 434972 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3147221Z [rank2]:E1204 13:37:59.208000 434972 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3147496Z [rank2]:E1204 13:37:59.208000 434972 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3147634Z [rank2]:E1204 13:37:59.208000 434972 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.3147909Z [rank2]:E1204 13:37:59.208000 434972 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3148068Z [rank2]:E1204 13:37:59.208000 434972 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3148540Z [rank2]:E1204 13:37:59.208000 434972 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 3879731200. 2025-12-04T13:38:32.3148656Z [rank2]:E1204 13:37:59.208000 434972 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3148850Z [rank2]:E1204 13:37:59.208000 434972 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3149218Z [rank2]:E1204 13:37:59.208000 434972 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.3149332Z [rank2]:E1204 13:37:59.208000 434972 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3149543Z [rank2]:E1204 13:37:59.208000 434972 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3149748Z [rank2]:E1204 13:37:59.208000 434972 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.3149784Z dist init r=2, world=4 2025-12-04T13:38:32.3149920Z [rank3]:E1204 13:37:59.271000 434973 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3150081Z [rank3]:E1204 13:37:59.271000 434973 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3150368Z [rank3]:E1204 13:37:59.271000 434973 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3150534Z [rank3]:E1204 13:37:59.271000 434973 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.3150817Z [rank3]:E1204 13:37:59.271000 434973 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3150953Z [rank3]:E1204 13:37:59.271000 434973 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3151229Z [rank3]:E1204 13:37:59.271000 434973 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3151379Z [rank3]:E1204 13:37:59.271000 434973 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:38:32.3151658Z [rank3]:E1204 13:37:59.271000 434973 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3151805Z [rank3]:E1204 13:37:59.271000 434973 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3152081Z [rank3]:E1204 13:37:59.271000 434973 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3152231Z [rank3]:E1204 13:37:59.271000 434973 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.3152508Z [rank3]:E1204 13:37:59.271000 434973 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3152654Z [rank3]:E1204 13:37:59.271000 434973 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3153125Z [rank3]:E1204 13:37:59.271000 434973 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 114176 on device 3. CUDA driver allocated memory was 2250244096 and is now 3829399552. 2025-12-04T13:38:32.3153238Z [rank3]:E1204 13:37:59.271000 434973 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3153445Z [rank3]:E1204 13:37:59.271000 434973 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3153802Z [rank3]:E1204 13:37:59.271000 434973 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.3153914Z [rank3]:E1204 13:37:59.271000 434973 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3154124Z [rank3]:E1204 13:37:59.271000 434973 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3154286Z [rank3]:E1204 13:37:59.271000 434973 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.3154326Z dist init r=3, world=4 2025-12-04T13:38:32.3154461Z [rank0]:E1204 13:37:59.288000 434970 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3154624Z [rank0]:E1204 13:37:59.288000 434970 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3154919Z [rank0]:E1204 13:37:59.288000 434970 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3155081Z [rank0]:E1204 13:37:59.288000 434970 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:38:32.3155364Z [rank0]:E1204 13:37:59.288000 434970 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3155488Z [rank0]:E1204 13:37:59.288000 434970 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3155766Z [rank0]:E1204 13:37:59.288000 434970 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3155912Z [rank0]:E1204 13:37:59.288000 434970 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3156190Z [rank0]:E1204 13:37:59.288000 434970 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3156335Z [rank0]:E1204 13:37:59.288000 434970 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3156623Z [rank0]:E1204 13:37:59.288000 434970 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3156758Z [rank0]:E1204 13:37:59.288000 434970 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.3157034Z [rank0]:E1204 13:37:59.288000 434970 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3157182Z [rank0]:E1204 13:37:59.288000 434970 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3157668Z [rank0]:E1204 13:37:59.288000 434970 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 0. CUDA driver allocated memory was 2453667840 and is now 4032823296. 
2025-12-04T13:38:32.3157783Z [rank0]:E1204 13:37:59.288000 434970 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3157976Z [rank0]:E1204 13:37:59.288000 434970 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3158329Z [rank0]:E1204 13:37:59.288000 434970 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.3158442Z [rank0]:E1204 13:37:59.288000 434970 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3158654Z [rank0]:E1204 13:37:59.288000 434970 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3158816Z [rank0]:E1204 13:37:59.288000 434970 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.3158853Z dist init r=0, world=4 2025-12-04T13:38:32.3159198Z [rank0]:[W1204 13:37:59.676316395 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:38:32.3159246Z FAILED [11.4239s] [100%] 2025-12-04T13:38:32.3159249Z 2025-12-04T13:38:32.3159305Z =================================== FAILURES =================================== 2025-12-04T13:38:32.3159403Z ____ TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda ____ 2025-12-04T13:38:32.3159450Z Traceback (most recent call last): 2025-12-04T13:38:32.3159649Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.3159692Z self._join_processes(fn) 2025-12-04T13:38:32.3159864Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.3159917Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.3160094Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.3160138Z raise RuntimeError(error) 2025-12-04T13:38:32.3160218Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.3160263Z Traceback (most recent call last): 2025-12-04T13:38:32.3160425Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3160484Z getattr(self, test_name)() 2025-12-04T13:38:32.3160642Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3160675Z fn() 2025-12-04T13:38:32.3160830Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3160869Z method(*args, **kwargs) 2025-12-04T13:38:32.3161021Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3161061Z method(*args, **kwargs) 2025-12-04T13:38:32.3161211Z File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3161247Z with policy(): 2025-12-04T13:38:32.3161400Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3161454Z raise RuntimeError(msg) 2025-12-04T13:38:32.3161801Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 3879731200. 2025-12-04T13:38:32.3161804Z 2025-12-04T13:38:32.3161878Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3162107Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.3162110Z 2025-12-04T13:38:32.3162196Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3162198Z 2025-12-04T13:38:32.3162200Z 2025-12-04T13:38:32.3162274Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.3162362Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.3162593Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-0af34500f77d76fa.xml - 2025-12-04T13:38:32.3162664Z =========================== short test summary info ============================ 2025-12-04T13:38:32.3162909Z FAILED [11.4239s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_shard_grad_op_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.3162966Z Traceback (most recent call last): 2025-12-04T13:38:32.3163131Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3163173Z getattr(self, test_name)() 2025-12-04T13:38:32.3163332Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3163368Z fn() 2025-12-04T13:38:32.3163519Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3163558Z method(*args, **kwargs) 2025-12-04T13:38:32.3163710Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3163749Z method(*args, **kwargs) 2025-12-04T13:38:32.3163901Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3163937Z with policy(): 2025-12-04T13:38:32.3164089Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3164128Z raise RuntimeError(msg) 2025-12-04T13:38:32.3164487Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 3879731200. 
2025-12-04T13:38:32.3164490Z 2025-12-04T13:38:32.3164563Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3164790Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.3164794Z 2025-12-04T13:38:32.3164879Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3164942Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.3165003Z ====================== 1 failed, 32 deselected in 11.58s ======================= 2025-12-04T13:38:32.3165040Z Got exit code 1 2025-12-04T13:38:32.3165230Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_shard_grad_op_cuda 2025-12-04T13:38:32.3165359Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:32.3165550Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-a024f07c1aa82907.xml 2025-12-04T13:38:32.3165607Z ============================= test session starts ============================== 2025-12-04T13:38:32.3165719Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.3165762Z cachedir: .pytest_cache 2025-12-04T13:38:32.3165920Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.3165967Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.3166006Z configfile: pytest.ini 2025-12-04T13:38:32.3166172Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.3166243Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.3166296Z stepcurrent: skipping 32 already run items. 
2025-12-04T13:38:32.3166338Z Running 1 items in this shard 2025-12-04T13:38:32.3166340Z 2025-12-04T13:38:32.3166645Z distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda I1204 13:38:03.779000 435303 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 435372 2025-12-04T13:38:32.3166809Z I1204 13:38:03.780000 435303 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 435373 2025-12-04T13:38:32.3166962Z I1204 13:38:03.780000 435303 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 435374 2025-12-04T13:38:32.3167113Z I1204 13:38:03.781000 435303 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 435375 2025-12-04T13:38:32.3167478Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3167527Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3167817Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:38:32.3167883Z {} 2025-12-04T13:38:32.3167987Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:38:32.3168060Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:38:32.3168569Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.3168632Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.3168988Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3169034Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3169338Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:38:32.3169400Z {} 2025-12-04T13:38:32.3169505Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:38:32.3169620Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:38:32.3170117Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:32.3170177Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.3170533Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3170580Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3170878Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:38:32.3170939Z {} 2025-12-04T13:38:32.3171053Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:38:32.3171124Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:38:32.3171615Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.3171675Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.3172032Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3172078Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3172368Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:38:32.3172427Z {} 2025-12-04T13:38:32.3172547Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:38:32.3172617Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:38:32.3173104Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
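The repeated UserWarning from torch/distributed/fsdp/_init_utils.py suggests either calling torch.cuda.set_device() before FSDP initialization or passing an indexed device as `device_id` instead of the bare "cuda" string. A minimal sketch of both options; `model` and `rank` are placeholders and this is not the wrapping code used by test_fsdp_core.py:

import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap(model: torch.nn.Module, rank: int) -> FSDP:
    # Option 1: make the current device explicit before constructing FSDP.
    torch.cuda.set_device(rank)
    # Option 2: pass a device with an explicit index rather than plain "cuda".
    return FSDP(model, device_id=torch.device("cuda", rank))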
2025-12-04T13:38:32.3173163Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.3173307Z [rank3]:E1204 13:38:09.408000 435375 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3173469Z [rank3]:E1204 13:38:09.408000 435375 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3173777Z [rank3]:E1204 13:38:09.408000 435375 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3173931Z [rank3]:E1204 13:38:09.408000 435375 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.3174223Z [rank3]:E1204 13:38:09.408000 435375 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3174349Z [rank3]:E1204 13:38:09.408000 435375 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3174630Z [rank3]:E1204 13:38:09.408000 435375 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3174782Z [rank3]:E1204 13:38:09.408000 435375 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3175070Z [rank3]:E1204 13:38:09.408000 435375 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3175217Z [rank3]:E1204 13:38:09.408000 435375 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3175502Z [rank3]:E1204 13:38:09.408000 435375 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3175639Z [rank3]:E1204 13:38:09.408000 435375 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.3175920Z [rank3]:E1204 13:38:09.408000 435375 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3176068Z [rank3]:E1204 13:38:09.408000 435375 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3176538Z [rank3]:E1204 13:38:09.408000 435375 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 3. CUDA driver allocated memory was 2250244096 and is now 2973761536. 
2025-12-04T13:38:32.3176652Z [rank3]:E1204 13:38:09.408000 435375 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3176849Z [rank3]:E1204 13:38:09.408000 435375 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3177207Z [rank3]:E1204 13:38:09.408000 435375 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:38:32.3177319Z [rank3]:E1204 13:38:09.408000 435375 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3177533Z [rank3]:E1204 13:38:09.408000 435375 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3177696Z [rank3]:E1204 13:38:09.408000 435375 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.3177734Z dist init r=3, world=4 2025-12-04T13:38:32.3177882Z [rank2]:E1204 13:38:09.424000 435374 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3178042Z [rank2]:E1204 13:38:09.424000 435374 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3178329Z [rank2]:E1204 13:38:09.424000 435374 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3178484Z [rank2]:E1204 13:38:09.424000 435374 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.3178768Z [rank2]:E1204 13:38:09.424000 435374 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3178892Z [rank2]:E1204 13:38:09.424000 435374 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3179170Z [rank2]:E1204 13:38:09.424000 435374 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3179327Z [rank2]:E1204 13:38:09.424000 435374 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3179646Z [rank2]:E1204 13:38:09.424000 435374 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3179805Z [rank2]:E1204 13:38:09.424000 435374 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3180080Z [rank2]:E1204 13:38:09.424000 435374 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3180216Z [rank2]:E1204 13:38:09.424000 435374 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.3180496Z [rank2]:E1204 13:38:09.424000 435374 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3180644Z [rank2]:E1204 13:38:09.424000 435374 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3181108Z [rank2]:E1204 13:38:09.424000 435374 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 2. CUDA driver allocated memory was 2300575744 and is now 3024093184. 2025-12-04T13:38:32.3181236Z [rank2]:E1204 13:38:09.424000 435374 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3181432Z [rank2]:E1204 13:38:09.424000 435374 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3181778Z [rank2]:E1204 13:38:09.424000 435374 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:38:32.3181892Z [rank2]:E1204 13:38:09.424000 435374 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3182101Z [rank2]:E1204 13:38:09.424000 435374 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3182278Z [rank2]:E1204 13:38:09.424000 435374 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.3182315Z dist init r=2, world=4 2025-12-04T13:38:32.3182453Z [rank1]:E1204 13:38:09.435000 435373 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3182610Z [rank1]:E1204 13:38:09.435000 435373 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3182897Z [rank1]:E1204 13:38:09.435000 435373 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3183049Z [rank1]:E1204 13:38:09.435000 435373 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.3183334Z [rank1]:E1204 13:38:09.435000 435373 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3183457Z [rank1]:E1204 13:38:09.435000 435373 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3183743Z [rank1]:E1204 13:38:09.435000 435373 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3183900Z [rank1]:E1204 13:38:09.435000 435373 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T13:38:32.3184176Z [rank1]:E1204 13:38:09.435000 435373 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3184325Z [rank1]:E1204 13:38:09.435000 435373 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3184603Z [rank1]:E1204 13:38:09.435000 435373 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3184741Z [rank1]:E1204 13:38:09.435000 435373 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.3185018Z [rank1]:E1204 13:38:09.435000 435373 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3185166Z [rank1]:E1204 13:38:09.435000 435373 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3185642Z [rank1]:E1204 13:38:09.435000 435373 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 1. CUDA driver allocated memory was 2317352960 and is now 3040870400. 2025-12-04T13:38:32.3185755Z [rank1]:E1204 13:38:09.435000 435373 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3185949Z [rank1]:E1204 13:38:09.435000 435373 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3186296Z [rank1]:E1204 13:38:09.435000 435373 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:38:32.3186422Z [rank1]:E1204 13:38:09.435000 435373 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3186632Z [rank1]:E1204 13:38:09.435000 435373 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3186798Z [rank1]:E1204 13:38:09.435000 435373 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.3186835Z dist init r=1, world=4 2025-12-04T13:38:32.3186972Z [rank0]:E1204 13:38:09.459000 435372 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3187133Z [rank0]:E1204 13:38:09.459000 435372 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3187420Z [rank0]:E1204 13:38:09.459000 435372 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3187574Z [rank0]:E1204 13:38:09.459000 435372 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 
2025-12-04T13:38:32.3187868Z [rank0]:E1204 13:38:09.459000 435372 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3187991Z [rank0]:E1204 13:38:09.459000 435372 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3188283Z [rank0]:E1204 13:38:09.459000 435372 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3188428Z [rank0]:E1204 13:38:09.459000 435372 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3188710Z [rank0]:E1204 13:38:09.459000 435372 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3188858Z [rank0]:E1204 13:38:09.459000 435372 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3189134Z [rank0]:E1204 13:38:09.459000 435372 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3189268Z [rank0]:E1204 13:38:09.459000 435372 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.3189548Z [rank0]:E1204 13:38:09.459000 435372 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3189740Z [rank0]:E1204 13:38:09.459000 435372 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3190204Z [rank0]:E1204 13:38:09.459000 435372 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 0. CUDA driver allocated memory was 2453667840 and is now 3177185280. 
2025-12-04T13:38:32.3190319Z [rank0]:E1204 13:38:09.459000 435372 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3190512Z [rank0]:E1204 13:38:09.459000 435372 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3190872Z [rank0]:E1204 13:38:09.459000 435372 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:38:32.3190986Z [rank0]:E1204 13:38:09.459000 435372 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3191200Z [rank0]:E1204 13:38:09.459000 435372 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3191366Z [rank0]:E1204 13:38:09.459000 435372 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.3191403Z dist init r=0, world=4 2025-12-04T13:38:32.3191442Z FAILED [6.7165s] [100%] 2025-12-04T13:38:32.3191444Z 2025-12-04T13:38:32.3191501Z =================================== FAILURES =================================== 2025-12-04T13:38:32.3191598Z ______ TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda _______ 2025-12-04T13:38:32.3191643Z Traceback (most recent call last): 2025-12-04T13:38:32.3191805Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.3191860Z self._join_processes(fn) 2025-12-04T13:38:32.3192033Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.3192099Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.3192278Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.3192321Z raise RuntimeError(error) 2025-12-04T13:38:32.3192400Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.3192444Z Traceback (most recent call last): 2025-12-04T13:38:32.3192606Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3192647Z getattr(self, test_name)() 2025-12-04T13:38:32.3192807Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3192841Z fn() 2025-12-04T13:38:32.3192993Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3193033Z method(*args, **kwargs) 2025-12-04T13:38:32.3193187Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3193226Z method(*args, **kwargs) 2025-12-04T13:38:32.3193378Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3193426Z with policy(): 2025-12-04T13:38:32.3193580Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T13:38:32.3193620Z raise RuntimeError(msg) 2025-12-04T13:38:32.3193961Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 3. CUDA driver allocated memory was 2250244096 and is now 2973761536. 2025-12-04T13:38:32.3193965Z 2025-12-04T13:38:32.3194040Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3194258Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:38:32.3194260Z 2025-12-04T13:38:32.3194349Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3194351Z 2025-12-04T13:38:32.3194353Z 2025-12-04T13:38:32.3194438Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.3194525Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:38:32.3194760Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-a024f07c1aa82907.xml - 2025-12-04T13:38:32.3194822Z =========================== short test summary info ============================ 2025-12-04T13:38:32.3195058Z FAILED [6.7165s] distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.3195102Z Traceback (most recent call last): 2025-12-04T13:38:32.3195266Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3195309Z getattr(self, test_name)() 2025-12-04T13:38:32.3195471Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3195505Z fn() 2025-12-04T13:38:32.3195670Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3195708Z method(*args, **kwargs) 2025-12-04T13:38:32.3195859Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3195908Z method(*args, **kwargs) 2025-12-04T13:38:32.3196058Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3196093Z with policy(): 2025-12-04T13:38:32.3196246Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3196286Z raise RuntimeError(msg) 2025-12-04T13:38:32.3196627Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 3. CUDA driver allocated memory was 2250244096 and is now 2973761536. 
2025-12-04T13:38:32.3196630Z 2025-12-04T13:38:32.3196702Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3196921Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:38:32.3196924Z 2025-12-04T13:38:32.3197011Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3197073Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.3197147Z ======================= 1 failed, 32 deselected in 6.88s ======================= 2025-12-04T13:38:32.3197185Z Got exit code 1 2025-12-04T13:38:32.3197225Z Retrying single test... 2025-12-04T13:38:32.3197413Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-f72400cd29f545d5.xml 2025-12-04T13:38:32.3197474Z ============================= test session starts ============================== 2025-12-04T13:38:32.3197586Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.3197628Z cachedir: .pytest_cache 2025-12-04T13:38:32.3197787Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.3197833Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.3197871Z configfile: pytest.ini 2025-12-04T13:38:32.3198035Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.3198121Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.3198332Z stepcurrent: skipping 32 already run items. 
Running only test/distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:38:32.3198374Z Running 1 items in this shard 2025-12-04T13:38:32.3198376Z 2025-12-04T13:38:32.3198670Z distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda I1204 13:38:13.063000 435697 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 435766 2025-12-04T13:38:32.3198826Z I1204 13:38:13.063000 435697 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 435767 2025-12-04T13:38:32.3198977Z I1204 13:38:13.064000 435697 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 435768 2025-12-04T13:38:32.3199130Z I1204 13:38:13.064000 435697 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 435769 2025-12-04T13:38:32.3199499Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3199547Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3199871Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:38:32.3199950Z {} 2025-12-04T13:38:32.3200052Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:38:32.3200126Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:38:32.3200619Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.3200678Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.3201035Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3201081Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3201370Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:38:32.3201442Z {} 2025-12-04T13:38:32.3201546Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:38:32.3201616Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:38:32.3202104Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.3202164Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.3202531Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3202579Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3202865Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:38:32.3202925Z {} 2025-12-04T13:38:32.3203027Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:38:32.3203099Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:38:32.3203586Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.3203646Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.3204017Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3204071Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3204355Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:38:32.3204415Z {} 2025-12-04T13:38:32.3204519Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:38:32.3204589Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:38:32.3205080Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:32.3205137Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.3205280Z [rank2]:E1204 13:38:18.652000 435768 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3205441Z [rank2]:E1204 13:38:18.652000 435768 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3205741Z [rank2]:E1204 13:38:18.652000 435768 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3205898Z [rank2]:E1204 13:38:18.652000 435768 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.3206182Z [rank2]:E1204 13:38:18.652000 435768 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3206308Z [rank2]:E1204 13:38:18.652000 435768 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3206584Z [rank2]:E1204 13:38:18.652000 435768 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3206743Z [rank2]:E1204 13:38:18.652000 435768 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3207020Z [rank2]:E1204 13:38:18.652000 435768 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3207168Z [rank2]:E1204 13:38:18.652000 435768 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3207444Z [rank2]:E1204 13:38:18.652000 435768 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3207579Z [rank2]:E1204 13:38:18.652000 435768 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.3207857Z [rank2]:E1204 13:38:18.652000 435768 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3208015Z [rank2]:E1204 13:38:18.652000 435768 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3208481Z [rank2]:E1204 13:38:18.652000 435768 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 2. CUDA driver allocated memory was 2300575744 and is now 3024093184. 
2025-12-04T13:38:32.3208608Z [rank2]:E1204 13:38:18.652000 435768 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3208804Z [rank2]:E1204 13:38:18.652000 435768 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3209150Z [rank2]:E1204 13:38:18.652000 435768 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:38:32.3209262Z [rank2]:E1204 13:38:18.652000 435768 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3209478Z [rank2]:E1204 13:38:18.652000 435768 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3209679Z [rank2]:E1204 13:38:18.652000 435768 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.3209733Z dist init r=2, world=4 2025-12-04T13:38:32.3209871Z [rank1]:E1204 13:38:18.653000 435767 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3210031Z [rank1]:E1204 13:38:18.653000 435767 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3210317Z [rank1]:E1204 13:38:18.653000 435767 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3210470Z [rank1]:E1204 13:38:18.653000 435767 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.3210755Z [rank1]:E1204 13:38:18.653000 435767 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3210891Z [rank1]:E1204 13:38:18.653000 435767 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3211170Z [rank1]:E1204 13:38:18.653000 435767 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3211316Z [rank1]:E1204 13:38:18.653000 435767 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3211595Z [rank1]:E1204 13:38:18.653000 435767 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3211744Z [rank1]:E1204 13:38:18.653000 435767 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3212020Z [rank1]:E1204 13:38:18.653000 435767 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3212157Z [rank1]:E1204 13:38:18.653000 435767 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.3212444Z [rank1]:E1204 13:38:18.653000 435767 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3212603Z [rank1]:E1204 13:38:18.653000 435767 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3213066Z [rank1]:E1204 13:38:18.653000 435767 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 1. CUDA driver allocated memory was 2317352960 and is now 3040870400. 2025-12-04T13:38:32.3213180Z [rank1]:E1204 13:38:18.653000 435767 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3213376Z [rank1]:E1204 13:38:18.653000 435767 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3213719Z [rank1]:E1204 13:38:18.653000 435767 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:38:32.3213834Z [rank1]:E1204 13:38:18.653000 435767 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3214046Z [rank1]:E1204 13:38:18.653000 435767 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3214223Z [rank1]:E1204 13:38:18.653000 435767 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.3214262Z dist init r=1, world=4 2025-12-04T13:38:32.3214399Z [rank3]:E1204 13:38:18.698000 435769 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3214558Z [rank3]:E1204 13:38:18.698000 435769 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3214848Z [rank3]:E1204 13:38:18.698000 435769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3215002Z [rank3]:E1204 13:38:18.698000 435769 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.3215297Z [rank3]:E1204 13:38:18.698000 435769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3215421Z [rank3]:E1204 13:38:18.698000 435769 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3215699Z [rank3]:E1204 13:38:18.698000 435769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3215846Z [rank3]:E1204 13:38:18.698000 435769 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T13:38:32.3216120Z [rank3]:E1204 13:38:18.698000 435769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3216269Z [rank3]:E1204 13:38:18.698000 435769 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3216557Z [rank3]:E1204 13:38:18.698000 435769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3216692Z [rank3]:E1204 13:38:18.698000 435769 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.3216983Z [rank3]:E1204 13:38:18.698000 435769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3217129Z [rank3]:E1204 13:38:18.698000 435769 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3217595Z [rank3]:E1204 13:38:18.698000 435769 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 3. CUDA driver allocated memory was 2250244096 and is now 2973761536. 2025-12-04T13:38:32.3217707Z [rank3]:E1204 13:38:18.698000 435769 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3217902Z [rank3]:E1204 13:38:18.698000 435769 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3218243Z [rank3]:E1204 13:38:18.698000 435769 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:38:32.3218366Z [rank3]:E1204 13:38:18.698000 435769 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3218577Z [rank3]:E1204 13:38:18.698000 435769 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3218741Z [rank3]:E1204 13:38:18.698000 435769 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.3218779Z dist init r=3, world=4 2025-12-04T13:38:32.3218916Z [rank0]:E1204 13:38:18.704000 435766 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3219076Z [rank0]:E1204 13:38:18.704000 435766 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3219378Z [rank0]:E1204 13:38:18.704000 435766 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3219532Z [rank0]:E1204 13:38:18.704000 435766 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 
2025-12-04T13:38:32.3219860Z [rank0]:E1204 13:38:18.704000 435766 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3219983Z [rank0]:E1204 13:38:18.704000 435766 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3220258Z [rank0]:E1204 13:38:18.704000 435766 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3220406Z [rank0]:E1204 13:38:18.704000 435766 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3220683Z [rank0]:E1204 13:38:18.704000 435766 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3220842Z [rank0]:E1204 13:38:18.704000 435766 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3221118Z [rank0]:E1204 13:38:18.704000 435766 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3221270Z [rank0]:E1204 13:38:18.704000 435766 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.3221547Z [rank0]:E1204 13:38:18.704000 435766 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3221696Z [rank0]:E1204 13:38:18.704000 435766 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3222156Z [rank0]:E1204 13:38:18.704000 435766 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 0. CUDA driver allocated memory was 2453667840 and is now 3177185280. 
2025-12-04T13:38:32.3222271Z [rank0]:E1204 13:38:18.704000 435766 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3222464Z [rank0]:E1204 13:38:18.704000 435766 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3222819Z [rank0]:E1204 13:38:18.704000 435766 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:38:32.3222933Z [rank0]:E1204 13:38:18.704000 435766 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3223143Z [rank0]:E1204 13:38:18.704000 435766 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3223309Z [rank0]:E1204 13:38:18.704000 435766 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.3223346Z dist init r=0, world=4 2025-12-04T13:38:32.3223384Z FAILED [6.6161s] [100%] 2025-12-04T13:38:32.3223386Z 2025-12-04T13:38:32.3223442Z =================================== FAILURES =================================== 2025-12-04T13:38:32.3223547Z ______ TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda _______ 2025-12-04T13:38:32.3223593Z Traceback (most recent call last): 2025-12-04T13:38:32.3223757Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.3223800Z self._join_processes(fn) 2025-12-04T13:38:32.3223973Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:38:32.3224026Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.3224203Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.3224245Z raise RuntimeError(error) 2025-12-04T13:38:32.3224322Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.3224367Z Traceback (most recent call last): 2025-12-04T13:38:32.3224528Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3224570Z getattr(self, test_name)() 2025-12-04T13:38:32.3224738Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3224773Z fn() 2025-12-04T13:38:32.3224923Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3224975Z method(*args, **kwargs) 2025-12-04T13:38:32.3225124Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3225164Z method(*args, **kwargs) 2025-12-04T13:38:32.3225313Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3225350Z with policy(): 2025-12-04T13:38:32.3225502Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T13:38:32.3225543Z raise RuntimeError(msg) 2025-12-04T13:38:32.3225884Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 1. CUDA driver allocated memory was 2317352960 and is now 3040870400. 2025-12-04T13:38:32.3225887Z 2025-12-04T13:38:32.3225962Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3226178Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:38:32.3226180Z 2025-12-04T13:38:32.3226278Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3226280Z 2025-12-04T13:38:32.3226339Z Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.3226383Z Traceback (most recent call last): 2025-12-04T13:38:32.3226545Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3226586Z getattr(self, test_name)() 2025-12-04T13:38:32.3226744Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3226779Z fn() 2025-12-04T13:38:32.3226931Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3226969Z method(*args, **kwargs) 2025-12-04T13:38:32.3227119Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3227157Z method(*args, **kwargs) 2025-12-04T13:38:32.3227320Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3227356Z with policy(): 2025-12-04T13:38:32.3227508Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3227548Z raise RuntimeError(msg) 2025-12-04T13:38:32.3227888Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 2. CUDA driver allocated memory was 2300575744 and is now 3024093184. 2025-12-04T13:38:32.3227891Z 2025-12-04T13:38:32.3227963Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3228178Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:38:32.3228181Z 2025-12-04T13:38:32.3228268Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3228270Z 2025-12-04T13:38:32.3228272Z 2025-12-04T13:38:32.3228346Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.3228443Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:38:32.3228676Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-f72400cd29f545d5.xml - 2025-12-04T13:38:32.3228746Z =========================== short test summary info ============================ 2025-12-04T13:38:32.3228979Z FAILED [6.6161s] distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:38:32.3229025Z Traceback (most recent call last): 2025-12-04T13:38:32.3229190Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3229232Z getattr(self, test_name)() 2025-12-04T13:38:32.3229392Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3229427Z fn() 2025-12-04T13:38:32.3229684Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3229725Z method(*args, **kwargs) 2025-12-04T13:38:32.3229876Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3229915Z method(*args, **kwargs) 2025-12-04T13:38:32.3230066Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3230120Z with policy(): 2025-12-04T13:38:32.3230274Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3230313Z raise RuntimeError(msg) 2025-12-04T13:38:32.3230657Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 1. CUDA driver allocated memory was 2317352960 and is now 3040870400. 
2025-12-04T13:38:32.3230660Z 2025-12-04T13:38:32.3230731Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3230946Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:38:32.3230948Z 2025-12-04T13:38:32.3231033Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3231037Z 2025-12-04T13:38:32.3231108Z Process 2 exited with error code 10 and exception: 2025-12-04T13:38:32.3231151Z Traceback (most recent call last): 2025-12-04T13:38:32.3231313Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3231353Z getattr(self, test_name)() 2025-12-04T13:38:32.3231513Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3231548Z fn() 2025-12-04T13:38:32.3231697Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3231736Z method(*args, **kwargs) 2025-12-04T13:38:32.3231884Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3231924Z method(*args, **kwargs) 2025-12-04T13:38:32.3232074Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3232109Z with policy(): 2025-12-04T13:38:32.3232260Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3232300Z raise RuntimeError(msg) 2025-12-04T13:38:32.3232653Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 2. CUDA driver allocated memory was 2300575744 and is now 3024093184. 2025-12-04T13:38:32.3232669Z 2025-12-04T13:38:32.3232742Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3232956Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:38:32.3232959Z 2025-12-04T13:38:32.3233045Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3233108Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:38:32.3233169Z ======================= 1 failed, 32 deselected in 6.78s ======================= 2025-12-04T13:38:32.3233205Z Got exit code 1 2025-12-04T13:38:32.3233245Z Retrying single test... 
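The failures above all come from the memory-leak checker that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 turns on: as the tracebacks show, a context manager in torch/testing/_internal/common_utils.py snapshots caching-allocator and driver-level memory before the test body runs and raises in __exit__ if the numbers have grown. A minimal sketch of that before/after comparison, given purely as an illustration (check_for_cuda_memory_leak and its single-device scope are assumptions, not the harness's actual code):

import torch

def check_for_cuda_memory_leak(test_fn, device: int = 0):
    # Hypothetical illustration of a before/after memory comparison; the real
    # check in common_utils.py tracks every visible device and takes extra care
    # (e.g. around the caching allocator) to avoid false positives.
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)   # caching-allocator bytes
    free_before, _ = torch.cuda.mem_get_info(device)      # driver-level free bytes

    test_fn()

    torch.cuda.synchronize(device)
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _ = torch.cuda.mem_get_info(device)

    if alloc_after > alloc_before or free_after < free_before:
        raise RuntimeError(
            f"possible CUDA memory leak: allocator {alloc_before} -> {alloc_after} bytes, "
            f"driver free memory {free_before} -> {free_after} bytes"
        )

In the log above the allocator count on each rank goes from 512 to 22528 bytes and driver-allocated memory grows by roughly 700 MB per device, which is the kind of growth such a before/after check flags.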
2025-12-04T13:38:32.3233434Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-bafa624f55ba28c7.xml 2025-12-04T13:38:32.3233491Z ============================= test session starts ============================== 2025-12-04T13:38:32.3233603Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.3233643Z cachedir: .pytest_cache 2025-12-04T13:38:32.3233803Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.3233869Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.3233909Z configfile: pytest.ini 2025-12-04T13:38:32.3234071Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.3234145Z collecting ... collected 60 items / 32 deselected / 28 selected 2025-12-04T13:38:32.3234357Z stepcurrent: skipping 32 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:38:32.3234401Z Running 1 items in this shard 2025-12-04T13:38:32.3234403Z 2025-12-04T13:38:32.3234695Z distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda I1204 13:38:22.210000 436091 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 436160 2025-12-04T13:38:32.3234862Z I1204 13:38:22.211000 436091 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 436161 2025-12-04T13:38:32.3235014Z I1204 13:38:22.211000 436091 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 436162 2025-12-04T13:38:32.3235164Z I1204 13:38:22.212000 436091 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 436163 2025-12-04T13:38:32.3235523Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3235570Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3235862Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:38:32.3235925Z {} 2025-12-04T13:38:32.3236029Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:38:32.3236101Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:38:32.3236609Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:32.3236682Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.3237037Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3237085Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3237373Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:38:32.3237435Z {} 2025-12-04T13:38:32.3237537Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:38:32.3237609Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:38:32.3238095Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.3238167Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.3238519Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3238565Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3238850Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:38:32.3238909Z {} 2025-12-04T13:38:32.3239011Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:38:32.3239093Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:38:32.3239622Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:38:32.3239680Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.3240037Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:38:32.3240081Z self.encoder = TransformerEncoder( 2025-12-04T13:38:32.3240371Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:38:32.3240431Z {} 2025-12-04T13:38:32.3240545Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:38:32.3240616Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:38:32.3241101Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:38:32.3241172Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:38:32.3241317Z [rank3]:E1204 13:38:27.865000 436163 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3241478Z [rank3]:E1204 13:38:27.865000 436163 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3241770Z [rank3]:E1204 13:38:27.865000 436163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3241928Z [rank3]:E1204 13:38:27.865000 436163 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.3242215Z [rank3]:E1204 13:38:27.865000 436163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3242352Z [rank3]:E1204 13:38:27.865000 436163 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3242630Z [rank3]:E1204 13:38:27.865000 436163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3242778Z [rank3]:E1204 13:38:27.865000 436163 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3243058Z [rank3]:E1204 13:38:27.865000 436163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3243204Z [rank3]:E1204 13:38:27.865000 436163 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3243491Z [rank3]:E1204 13:38:27.865000 436163 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3243628Z [rank3]:E1204 13:38:27.865000 436163 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.3243904Z [rank3]:E1204 13:38:27.865000 436163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3244053Z [rank3]:E1204 13:38:27.865000 436163 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3244519Z [rank3]:E1204 13:38:27.865000 436163 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 3. CUDA driver allocated memory was 2250244096 and is now 2973761536. 2025-12-04T13:38:32.3244636Z [rank3]:E1204 13:38:27.865000 436163 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3244841Z [rank3]:E1204 13:38:27.865000 436163 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3245188Z [rank3]:E1204 13:38:27.865000 436163 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:38:32.3245312Z [rank3]:E1204 13:38:27.865000 436163 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3245522Z [rank3]:E1204 13:38:27.865000 436163 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3245688Z [rank3]:E1204 13:38:27.865000 436163 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:38:32.3245726Z dist init r=3, world=4 2025-12-04T13:38:32.3245864Z [rank0]:E1204 13:38:27.883000 436160 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3246022Z [rank0]:E1204 13:38:27.883000 436160 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3246311Z [rank0]:E1204 13:38:27.883000 436160 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3246464Z [rank0]:E1204 13:38:27.883000 436160 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.3246767Z [rank0]:E1204 13:38:27.883000 436160 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3246892Z [rank0]:E1204 13:38:27.883000 436160 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3247169Z [rank0]:E1204 13:38:27.883000 436160 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3247317Z [rank0]:E1204 13:38:27.883000 436160 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3247592Z [rank0]:E1204 13:38:27.883000 436160 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3247751Z [rank0]:E1204 13:38:27.883000 436160 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3248025Z [rank0]:E1204 13:38:27.883000 436160 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3248161Z [rank0]:E1204 13:38:27.883000 436160 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.3248439Z [rank0]:E1204 13:38:27.883000 436160 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3248586Z [rank0]:E1204 13:38:27.883000 436160 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3249066Z [rank0]:E1204 13:38:27.883000 436160 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 0. CUDA driver allocated memory was 2453667840 and is now 3177185280. 
2025-12-04T13:38:32.3249179Z [rank0]:E1204 13:38:27.883000 436160 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3249392Z [rank0]:E1204 13:38:27.883000 436160 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3249755Z [rank0]:E1204 13:38:27.883000 436160 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:38:32.3249870Z [rank0]:E1204 13:38:27.883000 436160 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3250082Z [rank0]:E1204 13:38:27.883000 436160 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3250246Z [rank0]:E1204 13:38:27.883000 436160 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:38:32.3250286Z dist init r=0, world=4 2025-12-04T13:38:32.3250422Z [rank2]:E1204 13:38:27.895000 436162 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3250580Z [rank2]:E1204 13:38:27.895000 436162 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3250865Z [rank2]:E1204 13:38:27.895000 436162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3251033Z [rank2]:E1204 13:38:27.895000 436162 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.3251317Z [rank2]:E1204 13:38:27.895000 436162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3251440Z [rank2]:E1204 13:38:27.895000 436162 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3251714Z [rank2]:E1204 13:38:27.895000 436162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3251872Z [rank2]:E1204 13:38:27.895000 436162 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3252150Z [rank2]:E1204 13:38:27.895000 436162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3252296Z [rank2]:E1204 13:38:27.895000 436162 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3252572Z [rank2]:E1204 13:38:27.895000 436162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3252707Z [rank2]:E1204 13:38:27.895000 436162 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:38:32.3252985Z [rank2]:E1204 13:38:27.895000 436162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3253135Z [rank2]:E1204 13:38:27.895000 436162 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3253611Z [rank2]:E1204 13:38:27.895000 436162 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 2. CUDA driver allocated memory was 2300575744 and is now 3024093184. 2025-12-04T13:38:32.3253736Z [rank2]:E1204 13:38:27.895000 436162 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3253930Z [rank2]:E1204 13:38:27.895000 436162 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3254272Z [rank2]:E1204 13:38:27.895000 436162 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:38:32.3254384Z [rank2]:E1204 13:38:27.895000 436162 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3254595Z [rank2]:E1204 13:38:27.895000 436162 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3254759Z [rank2]:E1204 13:38:27.895000 436162 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:38:32.3254797Z dist init r=2, world=4 2025-12-04T13:38:32.3254933Z [rank1]:E1204 13:38:27.924000 436161 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:38:32.3255104Z [rank1]:E1204 13:38:27.924000 436161 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:38:32.3255391Z [rank1]:E1204 13:38:27.924000 436161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3255543Z [rank1]:E1204 13:38:27.924000 436161 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:38:32.3255829Z [rank1]:E1204 13:38:27.924000 436161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3255951Z [rank1]:E1204 13:38:27.924000 436161 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:38:32.3256239Z [rank1]:E1204 13:38:27.924000 436161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3256386Z [rank1]:E1204 13:38:27.924000 436161 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T13:38:32.3256663Z [rank1]:E1204 13:38:27.924000 436161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3256810Z [rank1]:E1204 13:38:27.924000 436161 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:38:32.3257085Z [rank1]:E1204 13:38:27.924000 436161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3257222Z [rank1]:E1204 13:38:27.924000 436161 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:38:32.3257507Z [rank1]:E1204 13:38:27.924000 436161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3257654Z [rank1]:E1204 13:38:27.924000 436161 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:38:32.3258129Z [rank1]:E1204 13:38:27.924000 436161 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 1. CUDA driver allocated memory was 2317352960 and is now 3040870400. 2025-12-04T13:38:32.3258244Z [rank1]:E1204 13:38:27.924000 436161 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3258438Z [rank1]:E1204 13:38:27.924000 436161 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3258778Z [rank1]:E1204 13:38:27.924000 436161 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:38:32.3258891Z [rank1]:E1204 13:38:27.924000 436161 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:38:32.3259100Z [rank1]:E1204 13:38:27.924000 436161 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3259276Z [rank1]:E1204 13:38:27.924000 436161 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:38:32.3259314Z dist init r=1, world=4 2025-12-04T13:38:32.3259352Z FAILED [6.6151s] [100%] 2025-12-04T13:38:32.3259354Z 2025-12-04T13:38:32.3259409Z =================================== FAILURES =================================== 2025-12-04T13:38:32.3259503Z ______ TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda _______ 2025-12-04T13:38:32.3259549Z Traceback (most recent call last): 2025-12-04T13:38:32.3259751Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:38:32.3259794Z self._join_processes(fn) 2025-12-04T13:38:32.3259965Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in 
_join_processes 2025-12-04T13:38:32.3260019Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:38:32.3260210Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:38:32.3260253Z raise RuntimeError(error) 2025-12-04T13:38:32.3260331Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.3260376Z Traceback (most recent call last): 2025-12-04T13:38:32.3260537Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3260578Z getattr(self, test_name)() 2025-12-04T13:38:32.3260736Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3260770Z fn() 2025-12-04T13:38:32.3260921Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3260961Z method(*args, **kwargs) 2025-12-04T13:38:32.3261113Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3261153Z method(*args, **kwargs) 2025-12-04T13:38:32.3261304Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3261341Z with policy(): 2025-12-04T13:38:32.3261504Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3261559Z raise RuntimeError(msg) 2025-12-04T13:38:32.3261894Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 3. CUDA driver allocated memory was 2250244096 and is now 2973761536. 2025-12-04T13:38:32.3261896Z 2025-12-04T13:38:32.3261969Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3262186Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:38:32.3262189Z 2025-12-04T13:38:32.3262274Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3262276Z 2025-12-04T13:38:32.3262279Z 2025-12-04T13:38:32.3262353Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:38:32.3262438Z Process 3 terminated with exit code 10, terminating remaining processes. 
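Each rank also printed the FSDP UserWarning about `device_id` being passed as the bare device string "cuda" with no index; the warning's own remedy is to call torch.cuda.set_device() before FSDP initialization or to pass an explicit device index as `device_id`. A short sketch that follows that advice (the LOCAL_RANK convention, the nccl backend, and the nn.Linear stand-in module are assumptions for illustration, not taken from the test itself):

import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def build_fsdp_model() -> FSDP:
    # Assumed launch convention: one process per GPU with LOCAL_RANK exported
    # (as torchrun does); illustrative only.
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))

    # Option 1 from the warning: make the current device explicit before FSDP init.
    torch.cuda.set_device(local_rank)

    if not dist.is_initialized():
        dist.init_process_group(backend="nccl")

    model = nn.Linear(1024, 1024)  # stand-in for the wrapped module

    # Option 2 from the warning: pass an indexed device rather than plain "cuda".
    return FSDP(model, device_id=torch.device("cuda", local_rank))

Passing an indexed device avoids the warning outright; calling set_device first at least ensures the "current device" FSDP falls back to is the intended one. This addresses only the warning; the leak failure itself is what the PYTORCH_TEST_CUDA_MEM_LEAK_CHECK repro command quoted above exercises.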
2025-12-04T13:38:32.3262678Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-bafa624f55ba28c7.xml - 2025-12-04T13:38:32.3262738Z =========================== short test summary info ============================ 2025-12-04T13:38:32.3262969Z FAILED [6.6151s] distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:38:32.3263027Z Traceback (most recent call last): 2025-12-04T13:38:32.3263191Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:38:32.3263234Z getattr(self, test_name)() 2025-12-04T13:38:32.3263393Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:38:32.3263428Z fn() 2025-12-04T13:38:32.3263579Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3263618Z method(*args, **kwargs) 2025-12-04T13:38:32.3263766Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:38:32.3263806Z method(*args, **kwargs) 2025-12-04T13:38:32.3263963Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:38:32.3264001Z with policy(): 2025-12-04T13:38:32.3264152Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:38:32.3264193Z raise RuntimeError(msg) 2025-12-04T13:38:32.3264528Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 3. CUDA driver allocated memory was 2250244096 and is now 2973761536. 2025-12-04T13:38:32.3264532Z 2025-12-04T13:38:32.3264606Z To execute this test, run the following from the base repo dir: 2025-12-04T13:38:32.3264825Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:38:32.3264828Z 2025-12-04T13:38:32.3264915Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:38:32.3264978Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:38:32.3265036Z ======================= 1 failed, 32 deselected in 6.76s ======================= 2025-12-04T13:38:32.3265088Z Got exit code 1 2025-12-04T13:38:32.3265256Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:38:32.3265394Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:38:32.3265584Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-b83aefff71d6b746.xml 2025-12-04T13:38:32.3265641Z ============================= test session starts ============================== 2025-12-04T13:38:32.3265755Z platform linux -- Python 3.10.14, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.10/bin/python 2025-12-04T13:38:32.3265796Z cachedir: .pytest_cache 2025-12-04T13:38:32.3265951Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:38:32.3265997Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:38:32.3266036Z configfile: pytest.ini 2025-12-04T13:38:32.3266200Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:38:32.3266273Z collecting ... collected 60 items / 33 deselected / 27 selected 2025-12-04T13:38:32.3266326Z stepcurrent: skipping 33 already run items. 2025-12-04T13:38:32.3266367Z Running 0 items in this shard 2025-12-04T13:38:32.3266369Z 2025-12-04T13:38:32.3266604Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-b83aefff71d6b746.xml - 2025-12-04T13:38:32.3266672Z ============================ 33 deselected in 0.01s ============================ 2025-12-04T13:38:32.3272245Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_False_mixed_precision_False_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_False_mixed_precision_True_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_register_functions_called_cuda_first_True_mixed_precision_False_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_no_shard_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_no_shard_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_no_shard_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_none_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_false_shard_grad_op_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_no_shard_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_none_cuda', 
'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_none_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_shard_grad_op_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_no_shard_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_none_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_false_shard_grad_op_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_no_shard_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_shard_grad_op_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_false_shard_grad_op_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_no_shard_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_none_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_shard_grad_op_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_none_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_true_shard_grad_op_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_none_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_true_shard_grad_op_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda'] 2025-12-04T13:38:32.3272276Z 2025-12-04T13:38:32.3272462Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_core 1/2 (test/test-reports/distributed.fsdp.test_fsdp_core_1.2_d5d5bc8f8345486d_.log) 2025-12-04T13:38:32.3272464Z 2025-12-04T13:38:32.3272587Z Finished distributed/fsdp/test_fsdp_core 1/2 ... 
[2025-12-04 13:38:31.931630][5233552.910669318], took 41.64min 2025-12-04T13:38:32.3272855Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T13:38:32.3272939Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:38:32.3273035Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T13:38:32.3273091Z Uploading artifacts took 0.00 seconds 2025-12-04T13:38:32.3273142Z distributed/fsdp/test_fsdp_core 1/2 failed! 2025-12-04T13:38:32.3273249Z Running distributed/test_c10d_spawn_ucc 1/1 ... [2025-12-04 13:38:31.934610][5233552.913651199] 2025-12-04T13:38:32.3273297Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:38:32.3273626Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_c10d_spawn_ucc.py', '--shard-id=1', '--num-shards=1', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:38:31.934792] 2025-12-04T13:38:44.6649472Z 2025-12-04T13:38:44.6650102Z distributed/test_c10d_spawn_ucc 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_c10d_spawn_ucc_1.1_4ecd37b7dc2a6472_.log 2025-12-04T13:38:44.6652316Z Running 6 items in this shard: test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_all_gather, test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_all_to_all, test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_all_to_all_single, test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_allreduce, test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_broadcast, test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_reduce 2025-12-04T13:38:44.6654146Z Running 1 items in this shard: test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_all_gather 2025-12-04T13:38:44.6654753Z Running 1 items in this shard: test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_all_to_all 2025-12-04T13:38:44.6655377Z Running 1 items in this shard: test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_all_to_all_single 2025-12-04T13:38:44.6656004Z Running 1 items in this shard: test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_allreduce 2025-12-04T13:38:44.6656597Z Running 1 items in this shard: test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_broadcast 2025-12-04T13:38:44.6657179Z Running 1 items in this shard: test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_reduce 2025-12-04T13:38:44.6657509Z 2025-12-04T13:38:44.6657757Z Finished distributed/test_c10d_spawn_ucc 1/1 ... [2025-12-04 13:38:44.664747][5233565.643783775], took 0.21min 2025-12-04T13:38:44.6675090Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T13:38:44.6684633Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:38:44.6686758Z Running distributed/test_c10d_gloo 1/1 ... 
[2025-12-04 13:38:44.668582][5233565.647623846] 2025-12-04T13:38:44.6687106Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:38:44.6688958Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_c10d_gloo.py', '--shard-id=1', '--num-shards=1', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:38:44.668770] 2025-12-04T13:58:41.8360937Z 2025-12-04T13:58:41.8361841Z distributed/test_c10d_gloo 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_c10d_gloo_1.1_598729a4fcd85d87_.log 2025-12-04T13:58:41.8407963Z Running 246 items in this shard: test/distributed/test_c10d_gloo.py::RendezvousTCPTest::test_tcp_init, test/distributed/test_c10d_gloo.py::RendezvousEnvTest::test_logging_init, test/distributed/test_c10d_gloo.py::TimeoutTest::test_default_store_timeout_gloo, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_coalesced_async, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_coalesced_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_inference_mode, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_into_tensor_coalesced, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_noncontiguous_input, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_coalesced_async, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_coalesced_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_coalesced_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_coalesced_checks_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_coalesced_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_op_timeout, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_overall_timeout, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_barrier_implies_wait, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_block_current_stream_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_broadcast_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_broadcast_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_broadcast_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_broadcast_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_broadcast_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_empty_tensors, 
test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_gather_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_gather_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_gather_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_gather_noncontiguous_input, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_gather_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_gather_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_multi_device_constructor, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_reduce_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_reduce_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_reduce_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_reduce_scatter, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_reduce_scatter_tensor, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_reduce_scatter_tensor_coalesced, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_reduce_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_reduce_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_scatter_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_scatter_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_scatter_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_scatter_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_scatter_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_send_recv_all_to_all, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_send_recv_complex, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_set_gloo_pg_timeout, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_sparse_allreduce_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_sparse_allreduce_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_sparse_allreduce_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_sparse_allreduce_cuda_dispatched, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_dataclass_output, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_dataclass_output_unused_param, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_dynamic_module, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_dynamic_weight_sharing, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_once_use_reentrant_False, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_once_use_reentrant_True, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_twice_static_graph_use_reentrant_False, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_twice_static_graph_use_reentrant_True, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_twice_use_reentrant_False, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_twice_use_reentrant_True, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_twice_weight_sharing, 
test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_unused_params_use_reentrant_False, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_unused_params_use_reentrant_True, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_weight_sharing_use_reentrant_False, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_weight_sharing_use_reentrant_True, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_comm_hook_future_passing_cpu, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_comm_hook_future_passing_gpu_gloo, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_comm_hook_register_just_once, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_comm_hook_sparse_gradients, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_complex_params, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_invalid_comm_hook_init, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_invalid_comm_hook_return_type, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_find_unused_parameters_when_unused_parameters_empty, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_global_local_unused_params_grad, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_global_local_unused_params_grad_with_grad_is_view, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_global_local_unused_params_grad_with_static_graph, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_gloo_backend_1gpu_module_device_ids_integer_list, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_gloo_backend_1gpu_module_device_ids_torch_device_list, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_gloo_backend_2gpu_module, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_gloo_backend_4gpu_module, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_gloo_backend_cpu_module, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_gloo_backend_cpu_module_grad_is_view, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ignored_output, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ignored_output_with_unused_parameters, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ignored_sharded_tensor, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_invalid_powerSGD_state, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_save_load_checkpoint, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_sparse_gradients, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_sparse_gradients_grad_is_view, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_sync_batch_norm_empty_input, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_sync_batch_norm_only_empty_input, test/distributed/test_c10d_gloo.py::ReducerTest::test_forward_backward, test/distributed/test_c10d_gloo.py::ReducerTest::test_forward_backward_optimizer, test/distributed/test_c10d_gloo.py::ReducerTest::test_forward_backward_unused_parameters, test/distributed/test_c10d_gloo.py::ReducerTest::test_multi_dtype_multi_bucket, 
test/distributed/test_c10d_gloo.py::ReducerTest::test_multi_dtype_single_bucket, test/distributed/test_c10d_gloo.py::ReducerTest::test_single_dtype_single_bucket, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_coalesced_async, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_coalesced_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_inference_mode, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_into_tensor_coalesced, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_noncontiguous_input, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_coalesced_async, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_coalesced_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_coalesced_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_coalesced_checks_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_coalesced_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_op_timeout, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_overall_timeout, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_barrier_implies_wait, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_block_current_stream_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_broadcast_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_broadcast_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_broadcast_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_broadcast_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_broadcast_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_empty_tensors, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_gather_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_gather_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_gather_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_gather_noncontiguous_input, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_gather_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_gather_stress_cuda, 
test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_multi_device_constructor, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_reduce_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_reduce_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_reduce_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_reduce_scatter, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_reduce_scatter_tensor, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_reduce_scatter_tensor_coalesced, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_reduce_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_reduce_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_scatter_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_scatter_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_scatter_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_scatter_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_scatter_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_send_recv_all_to_all, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_send_recv_complex, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_set_gloo_pg_timeout, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_sparse_allreduce_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_sparse_allreduce_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_sparse_allreduce_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_sparse_allreduce_cuda_dispatched, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_coalesced_async, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_coalesced_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_inference_mode, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_into_tensor_coalesced, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_noncontiguous_input, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_coalesced_async, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_coalesced_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_coalesced_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_coalesced_checks_cuda, 
test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_coalesced_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_op_timeout, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_overall_timeout, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_barrier_implies_wait, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_block_current_stream_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_broadcast_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_broadcast_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_broadcast_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_broadcast_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_broadcast_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_empty_tensors, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_gather_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_gather_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_gather_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_gather_noncontiguous_input, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_gather_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_gather_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_long, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_multi_device_constructor, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_reduce_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_reduce_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_reduce_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_reduce_scatter, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_reduce_scatter_tensor, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_reduce_scatter_tensor_coalesced, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_reduce_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_reduce_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_scatter_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_scatter_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_scatter_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_scatter_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_scatter_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_send_recv_all_to_all, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_send_recv_complex, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_set_gloo_pg_timeout, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_short_json, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_short_pickle, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_sparse_allreduce_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_sparse_allreduce_basics_cuda, 
test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_sparse_allreduce_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_sparse_allreduce_cuda_dispatched, test/distributed/test_c10d_gloo.py::CommTest::test_bool_tensors, test/distributed/test_c10d_gloo.py::CommTest::test_broadcast_coalesced_gloo_cpu, test/distributed/test_c10d_gloo.py::CommTest::test_broadcast_coalesced_gloo_cuda, test/distributed/test_c10d_gloo.py::CommTest::test_gloo_rank_membership, test/distributed/test_c10d_gloo.py::CommTest::test_gloo_warn_not_in_group, test/distributed/test_c10d_gloo.py::CommTest::test_sequence_num_incremented_gloo_default, test/distributed/test_c10d_gloo.py::CommTest::test_sequence_num_incremented_gloo_subgroup, test/distributed/test_c10d_gloo.py::CommTest::test_sequence_num_set_default_pg_gloo, test/distributed/test_c10d_gloo.py::CommTest::test_sequence_num_set_gloo_new_group, test/distributed/test_c10d_gloo.py::CommTest::test_tensor_dtype_complex, test/distributed/test_c10d_gloo.py::CommTest::test_tensor_dtype_mismatch, test/distributed/test_c10d_gloo.py::GlooProcessGroupWithDispatchedCollectivesTests::test_all_to_all_single, test/distributed/test_c10d_gloo.py::GlooProcessGroupWithDispatchedCollectivesTests::test_allgather_coalesced, test/distributed/test_c10d_gloo.py::GlooProcessGroupWithDispatchedCollectivesTests::test_allreduce_coalesced, test/distributed/test_c10d_gloo.py::GlooProcessGroupWithDispatchedCollectivesTests::test_collectives, test/distributed/test_c10d_gloo.py::GlooProcessGroupWithDispatchedCollectivesTests::test_default_process_group, test/distributed/test_c10d_gloo.py::GlooProcessGroupWithDispatchedCollectivesTests::test_init_process_group_for_all_backends, test/distributed/test_c10d_gloo.py::GlooProcessGroupWithDispatchedCollectivesTests::test_init_process_group_optional_backend, test/distributed/test_c10d_gloo.py::GlooProcessGroupWithDispatchedCollectivesTests::test_monitored_barrier, test/distributed/test_c10d_gloo.py::LargeCommTest::test_new_group_local_sync, test/distributed/test_c10d_gloo.py::LargeCommTest::test_new_group_local_sync_duplicate_pg, test/distributed/test_c10d_gloo.py::LargeCommTest::test_new_group_local_sync_sanity_check 2025-12-04T13:58:41.8439033Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::RendezvousTCPTest::test_tcp_init 2025-12-04T13:58:41.8439321Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::RendezvousEnvTest::test_logging_init 2025-12-04T13:58:41.8439648Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::TimeoutTest::test_default_store_timeout_gloo 2025-12-04T13:58:41.8439947Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_basics 2025-12-04T13:58:41.8440252Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_basics_cuda 2025-12-04T13:58:41.8440557Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_checks 2025-12-04T13:58:41.8440864Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_coalesced_async 2025-12-04T13:58:41.8441186Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_coalesced_checks 2025-12-04T13:58:41.8441506Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_inference_mode 2025-12-04T13:58:41.8441833Z Running 1 items in this 
shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_into_tensor_coalesced 2025-12-04T13:58:41.8442201Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_noncontiguous_input 2025-12-04T13:58:41.8442514Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_stress 2025-12-04T13:58:41.8442814Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_stress_cuda 2025-12-04T13:58:41.8443112Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_basics 2025-12-04T13:58:41.8443412Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_basics_cuda 2025-12-04T13:58:41.8443710Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_checks 2025-12-04T13:58:41.8444015Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_coalesced_async 2025-12-04T13:58:41.8444352Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_coalesced_basics 2025-12-04T13:58:41.8444673Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_coalesced_checks 2025-12-04T13:58:41.8445005Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_coalesced_checks_cuda 2025-12-04T13:58:41.8445332Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_coalesced_stress 2025-12-04T13:58:41.8445662Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_op_timeout 2025-12-04T13:58:41.8445973Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_overall_timeout 2025-12-04T13:58:41.8446280Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_stress 2025-12-04T13:58:41.8446588Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_stress_cuda 2025-12-04T13:58:41.8446892Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_barrier_implies_wait 2025-12-04T13:58:41.8447217Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_block_current_stream_cuda 2025-12-04T13:58:41.8447525Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_broadcast_basics 2025-12-04T13:58:41.8447847Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_broadcast_basics_cuda 2025-12-04T13:58:41.8448145Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_broadcast_checks 2025-12-04T13:58:41.8448440Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_broadcast_stress 2025-12-04T13:58:41.8448746Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_broadcast_stress_cuda 2025-12-04T13:58:41.8449044Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_empty_tensors 2025-12-04T13:58:41.8449333Z Running 1 items in this shard: 
test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_gather_basics 2025-12-04T13:58:41.8449702Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_gather_basics_cuda 2025-12-04T13:58:41.8449998Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_gather_checks 2025-12-04T13:58:41.8450303Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_gather_noncontiguous_input 2025-12-04T13:58:41.8450606Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_gather_stress 2025-12-04T13:58:41.8450918Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_gather_stress_cuda 2025-12-04T13:58:41.8451223Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_multi_device_constructor 2025-12-04T13:58:41.8451523Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_reduce_basics 2025-12-04T13:58:41.8451815Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_reduce_basics_cuda 2025-12-04T13:58:41.8452108Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_reduce_checks 2025-12-04T13:58:41.8452396Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_reduce_scatter 2025-12-04T13:58:41.8452692Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_reduce_scatter_tensor 2025-12-04T13:58:41.8453037Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_reduce_scatter_tensor_coalesced 2025-12-04T13:58:41.8453346Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_reduce_stress 2025-12-04T13:58:41.8453635Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_reduce_stress_cuda 2025-12-04T13:58:41.8453928Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_scatter_basics 2025-12-04T13:58:41.8454220Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_scatter_basics_cuda 2025-12-04T13:58:41.8454514Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_scatter_checks 2025-12-04T13:58:41.8454800Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_scatter_stress 2025-12-04T13:58:41.8455094Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_scatter_stress_cuda 2025-12-04T13:58:41.8455398Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_send_recv_all_to_all 2025-12-04T13:58:41.8455697Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_send_recv_complex 2025-12-04T13:58:41.8456015Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_set_gloo_pg_timeout 2025-12-04T13:58:41.8456324Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_sparse_allreduce_basics 2025-12-04T13:58:41.8456679Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_sparse_allreduce_basics_cuda 2025-12-04T13:58:41.8457030Z Running 1 items in this shard: 
test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_sparse_allreduce_checks 2025-12-04T13:58:41.8457356Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_sparse_allreduce_cuda_dispatched 2025-12-04T13:58:41.8457685Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_dataclass_output 2025-12-04T13:58:41.8458026Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_dataclass_output_unused_param 2025-12-04T13:58:41.8458391Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_dynamic_module 2025-12-04T13:58:41.8458767Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_dynamic_weight_sharing 2025-12-04T13:58:41.8459166Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_once_use_reentrant_False 2025-12-04T13:58:41.8459557Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_once_use_reentrant_True 2025-12-04T13:58:41.8460017Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_twice_static_graph_use_reentrant_False 2025-12-04T13:58:41.8460447Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_twice_static_graph_use_reentrant_True 2025-12-04T13:58:41.8460852Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_twice_use_reentrant_False 2025-12-04T13:58:41.8461243Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_twice_use_reentrant_True 2025-12-04T13:58:41.8461626Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_twice_weight_sharing 2025-12-04T13:58:41.8462020Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_unused_params_use_reentrant_False 2025-12-04T13:58:41.8462453Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_unused_params_use_reentrant_True 2025-12-04T13:58:41.8462869Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_weight_sharing_use_reentrant_False 2025-12-04T13:58:41.8463282Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_weight_sharing_use_reentrant_True 2025-12-04T13:58:41.8463670Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_comm_hook_future_passing_cpu 2025-12-04T13:58:41.8464040Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_comm_hook_future_passing_gpu_gloo 2025-12-04T13:58:41.8464410Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_comm_hook_register_just_once 2025-12-04T13:58:41.8464771Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_comm_hook_sparse_gradients 2025-12-04T13:58:41.8465131Z Running 1 items in this 
shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_complex_params 2025-12-04T13:58:41.8465468Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_invalid_comm_hook_init 2025-12-04T13:58:41.8465844Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_invalid_comm_hook_return_type 2025-12-04T13:58:41.8466233Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_find_unused_parameters_when_unused_parameters_empty 2025-12-04T13:58:41.8466619Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_global_local_unused_params_grad 2025-12-04T13:58:41.8467003Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_global_local_unused_params_grad_with_grad_is_view 2025-12-04T13:58:41.8467409Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_global_local_unused_params_grad_with_static_graph 2025-12-04T13:58:41.8467814Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_gloo_backend_1gpu_module_device_ids_integer_list 2025-12-04T13:58:41.8468226Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_gloo_backend_1gpu_module_device_ids_torch_device_list 2025-12-04T13:58:41.8468607Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_gloo_backend_2gpu_module 2025-12-04T13:58:41.8468945Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_gloo_backend_4gpu_module 2025-12-04T13:58:41.8469298Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_gloo_backend_cpu_module 2025-12-04T13:58:41.8469698Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_gloo_backend_cpu_module_grad_is_view 2025-12-04T13:58:41.8470043Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ignored_output 2025-12-04T13:58:41.8470389Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ignored_output_with_unused_parameters 2025-12-04T13:58:41.8470747Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ignored_sharded_tensor 2025-12-04T13:58:41.8471085Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_invalid_powerSGD_state 2025-12-04T13:58:41.8471436Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_save_load_checkpoint 2025-12-04T13:58:41.8471755Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_sparse_gradients 2025-12-04T13:58:41.8472092Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_sparse_gradients_grad_is_view 2025-12-04T13:58:41.8472446Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_sync_batch_norm_empty_input 2025-12-04T13:58:41.8472805Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_sync_batch_norm_only_empty_input 2025-12-04T13:58:41.8473128Z Running 1 
items in this shard: test/distributed/test_c10d_gloo.py::ReducerTest::test_forward_backward 2025-12-04T13:58:41.8473412Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ReducerTest::test_forward_backward_optimizer 2025-12-04T13:58:41.8473717Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ReducerTest::test_forward_backward_unused_parameters 2025-12-04T13:58:41.8474016Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ReducerTest::test_multi_dtype_multi_bucket 2025-12-04T13:58:41.8474322Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ReducerTest::test_multi_dtype_single_bucket 2025-12-04T13:58:41.8474614Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ReducerTest::test_single_dtype_single_bucket 2025-12-04T13:58:41.8474937Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_basics 2025-12-04T13:58:41.8475266Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_basics_cuda 2025-12-04T13:58:41.8475594Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_checks 2025-12-04T13:58:41.8475928Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_coalesced_async 2025-12-04T13:58:41.8476277Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_coalesced_checks 2025-12-04T13:58:41.8476625Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_inference_mode 2025-12-04T13:58:41.8476979Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_into_tensor_coalesced 2025-12-04T13:58:41.8477344Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_noncontiguous_input 2025-12-04T13:58:41.8477682Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_stress 2025-12-04T13:58:41.8478007Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_stress_cuda 2025-12-04T13:58:41.8478353Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_basics 2025-12-04T13:58:41.8478676Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_basics_cuda 2025-12-04T13:58:41.8478999Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_checks 2025-12-04T13:58:41.8479330Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_coalesced_async 2025-12-04T13:58:41.8479718Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_coalesced_basics 2025-12-04T13:58:41.8480069Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_coalesced_checks 2025-12-04T13:58:41.8480449Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_coalesced_checks_cuda 2025-12-04T13:58:41.8480807Z Running 1 items in this shard: 
test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_coalesced_stress 2025-12-04T13:58:41.8481145Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_op_timeout 2025-12-04T13:58:41.8481483Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_overall_timeout 2025-12-04T13:58:41.8481817Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_stress 2025-12-04T13:58:41.8482143Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_stress_cuda 2025-12-04T13:58:41.8482475Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_barrier_implies_wait 2025-12-04T13:58:41.8482814Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_block_current_stream_cuda 2025-12-04T13:58:41.8483147Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_broadcast_basics 2025-12-04T13:58:41.8483492Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_broadcast_basics_cuda 2025-12-04T13:58:41.8483817Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_broadcast_checks 2025-12-04T13:58:41.8484148Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_broadcast_stress 2025-12-04T13:58:41.8484473Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_broadcast_stress_cuda 2025-12-04T13:58:41.8484794Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_empty_tensors 2025-12-04T13:58:41.8485106Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_gather_basics 2025-12-04T13:58:41.8485423Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_gather_basics_cuda 2025-12-04T13:58:41.8485744Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_gather_checks 2025-12-04T13:58:41.8486081Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_gather_noncontiguous_input 2025-12-04T13:58:41.8486411Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_gather_stress 2025-12-04T13:58:41.8486730Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_gather_stress_cuda 2025-12-04T13:58:41.8487080Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_multi_device_constructor 2025-12-04T13:58:41.8487408Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_reduce_basics 2025-12-04T13:58:41.8487726Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_reduce_basics_cuda 2025-12-04T13:58:41.8488045Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_reduce_checks 2025-12-04T13:58:41.8488363Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_reduce_scatter 
2025-12-04T13:58:41.8488689Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_reduce_scatter_tensor 2025-12-04T13:58:41.8489038Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_reduce_scatter_tensor_coalesced 2025-12-04T13:58:41.8489397Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_reduce_stress 2025-12-04T13:58:41.8489748Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_reduce_stress_cuda 2025-12-04T13:58:41.8490067Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_scatter_basics 2025-12-04T13:58:41.8490389Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_scatter_basics_cuda 2025-12-04T13:58:41.8490712Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_scatter_checks 2025-12-04T13:58:41.8491023Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_scatter_stress 2025-12-04T13:58:41.8491341Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_scatter_stress_cuda 2025-12-04T13:58:41.8491671Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_send_recv_all_to_all 2025-12-04T13:58:41.8491999Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_send_recv_complex 2025-12-04T13:58:41.8492342Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_set_gloo_pg_timeout 2025-12-04T13:58:41.8492679Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_sparse_allreduce_basics 2025-12-04T13:58:41.8493048Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_sparse_allreduce_basics_cuda 2025-12-04T13:58:41.8493395Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_sparse_allreduce_checks 2025-12-04T13:58:41.8493751Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_sparse_allreduce_cuda_dispatched 2025-12-04T13:58:41.8494086Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_basics 2025-12-04T13:58:41.8494391Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_basics_cuda 2025-12-04T13:58:41.8494699Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_checks 2025-12-04T13:58:41.8495010Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_coalesced_async 2025-12-04T13:58:41.8495337Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_coalesced_checks 2025-12-04T13:58:41.8495662Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_inference_mode 2025-12-04T13:58:41.8495994Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_into_tensor_coalesced 2025-12-04T13:58:41.8496350Z Running 1 items in this shard: 
test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_noncontiguous_input 2025-12-04T13:58:41.8496669Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_stress 2025-12-04T13:58:41.8496974Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_stress_cuda 2025-12-04T13:58:41.8497299Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_basics 2025-12-04T13:58:41.8497607Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_basics_cuda 2025-12-04T13:58:41.8497912Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_checks 2025-12-04T13:58:41.8498226Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_coalesced_async 2025-12-04T13:58:41.8498568Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_coalesced_basics 2025-12-04T13:58:41.8498894Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_coalesced_checks 2025-12-04T13:58:41.8499229Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_coalesced_checks_cuda 2025-12-04T13:58:41.8499561Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_coalesced_stress 2025-12-04T13:58:41.8499910Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_op_timeout 2025-12-04T13:58:41.8500226Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_overall_timeout 2025-12-04T13:58:41.8500537Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_stress 2025-12-04T13:58:41.8500845Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_stress_cuda 2025-12-04T13:58:41.8501155Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_barrier_implies_wait 2025-12-04T13:58:41.8501489Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_block_current_stream_cuda 2025-12-04T13:58:41.8501799Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_broadcast_basics 2025-12-04T13:58:41.8502122Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_broadcast_basics_cuda 2025-12-04T13:58:41.8502425Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_broadcast_checks 2025-12-04T13:58:41.8502721Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_broadcast_stress 2025-12-04T13:58:41.8503026Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_broadcast_stress_cuda 2025-12-04T13:58:41.8503329Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_empty_tensors 2025-12-04T13:58:41.8503623Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_gather_basics 2025-12-04T13:58:41.8503920Z Running 1 items in this shard: 
test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_gather_basics_cuda 2025-12-04T13:58:41.8504217Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_gather_checks 2025-12-04T13:58:41.8504524Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_gather_noncontiguous_input 2025-12-04T13:58:41.8504834Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_gather_stress 2025-12-04T13:58:41.8505147Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_gather_stress_cuda 2025-12-04T13:58:41.8505435Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_long 2025-12-04T13:58:41.8505731Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_multi_device_constructor 2025-12-04T13:58:41.8506034Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_reduce_basics 2025-12-04T13:58:41.8506331Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_reduce_basics_cuda 2025-12-04T13:58:41.8506626Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_reduce_checks 2025-12-04T13:58:41.8506917Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_reduce_scatter 2025-12-04T13:58:41.8507235Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_reduce_scatter_tensor 2025-12-04T13:58:41.8507560Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_reduce_scatter_tensor_coalesced 2025-12-04T13:58:41.8507874Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_reduce_stress 2025-12-04T13:58:41.8508166Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_reduce_stress_cuda 2025-12-04T13:58:41.8508461Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_scatter_basics 2025-12-04T13:58:41.8508759Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_scatter_basics_cuda 2025-12-04T13:58:41.8509056Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_scatter_checks 2025-12-04T13:58:41.8509351Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_scatter_stress 2025-12-04T13:58:41.8509692Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_scatter_stress_cuda 2025-12-04T13:58:41.8510015Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_send_recv_all_to_all 2025-12-04T13:58:41.8510320Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_send_recv_complex 2025-12-04T13:58:41.8510637Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_set_gloo_pg_timeout 2025-12-04T13:58:41.8510935Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_short_json 2025-12-04T13:58:41.8511221Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_short_pickle 2025-12-04T13:58:41.8511524Z Running 1 items in 
this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_sparse_allreduce_basics 2025-12-04T13:58:41.8511849Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_sparse_allreduce_basics_cuda 2025-12-04T13:58:41.8512173Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_sparse_allreduce_checks 2025-12-04T13:58:41.8512504Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_sparse_allreduce_cuda_dispatched 2025-12-04T13:58:41.8512806Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::CommTest::test_bool_tensors 2025-12-04T13:58:41.8513079Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::CommTest::test_broadcast_coalesced_gloo_cpu 2025-12-04T13:58:41.8513367Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::CommTest::test_broadcast_coalesced_gloo_cuda 2025-12-04T13:58:41.8513667Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::CommTest::test_gloo_rank_membership 2025-12-04T13:58:41.8513942Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::CommTest::test_gloo_warn_not_in_group 2025-12-04T13:58:41.8514235Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::CommTest::test_sequence_num_incremented_gloo_default 2025-12-04T13:58:41.8514547Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::CommTest::test_sequence_num_incremented_gloo_subgroup 2025-12-04T13:58:41.8514852Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::CommTest::test_sequence_num_set_default_pg_gloo 2025-12-04T13:58:41.8515149Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::CommTest::test_sequence_num_set_gloo_new_group 2025-12-04T13:58:41.8515432Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::CommTest::test_tensor_dtype_complex 2025-12-04T13:58:41.8515707Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::CommTest::test_tensor_dtype_mismatch 2025-12-04T13:58:41.8516059Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::GlooProcessGroupWithDispatchedCollectivesTests::test_all_to_all_single 2025-12-04T13:58:41.8516452Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::GlooProcessGroupWithDispatchedCollectivesTests::test_allgather_coalesced 2025-12-04T13:58:41.8516847Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::GlooProcessGroupWithDispatchedCollectivesTests::test_allreduce_coalesced 2025-12-04T13:58:41.8517230Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::GlooProcessGroupWithDispatchedCollectivesTests::test_collectives 2025-12-04T13:58:41.8517619Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::GlooProcessGroupWithDispatchedCollectivesTests::test_default_process_group 2025-12-04T13:58:41.8518046Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::GlooProcessGroupWithDispatchedCollectivesTests::test_init_process_group_for_all_backends 2025-12-04T13:58:41.8518496Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::GlooProcessGroupWithDispatchedCollectivesTests::test_init_process_group_optional_backend 2025-12-04T13:58:41.8518915Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::GlooProcessGroupWithDispatchedCollectivesTests::test_monitored_barrier 2025-12-04T13:58:41.8519267Z Running 1 items in this shard: 
test/distributed/test_c10d_gloo.py::LargeCommTest::test_new_group_local_sync 2025-12-04T13:58:41.8519615Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::LargeCommTest::test_new_group_local_sync_duplicate_pg 2025-12-04T13:58:41.8520103Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::LargeCommTest::test_new_group_local_sync_sanity_check 2025-12-04T13:58:41.8520278Z 2025-12-04T13:58:41.8520397Z Finished distributed/test_c10d_gloo 1/1 ... [2025-12-04 13:58:41.840082][5234762.819108178], took 19.95min 2025-12-04T13:58:41.8520819Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T13:58:41.8521210Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:58:41.8521424Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T13:58:41.8521603Z Uploading artifacts took 0.00 seconds 2025-12-04T13:58:41.8521792Z Running distributed/test_c10d_ops_nccl 1/1 ... [2025-12-04 13:58:41.843827][5234762.822868377] 2025-12-04T13:58:41.8521984Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:58:41.8522383Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/test_c10d_ops_nccl.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:58:41.844047] 2025-12-04T13:58:51.1730537Z 2025-12-04T13:58:51.1731447Z distributed/test_c10d_ops_nccl 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_c10d_ops_nccl_1.1_9bb7c62b01c00575_.log 2025-12-04T13:58:51.1739957Z Running 30 items in this shard: test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_all_gather_v, test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_allgather_base_basics, test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_allgather_base_ops, test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_allgather_ops, test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_allreduce_float8, test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_allreduce_in_cudagraph, test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_allreduce_ops, test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_alltoall_ops_with_cudafree_race, test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_barrier, test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_broadcast_ops, test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_empty_tensors, test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_gather_checks, test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_gather_ops, test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_gather_stress, test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_nccl_watchdog_cudagraph, test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_reduce_ops, test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_reduce_scatter_base_basics, test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_reduce_scatter_base_ops, test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_reduce_scatter_bfloat16, 
test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_reduce_scatter_float8, test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_reduce_scatter_ops, test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_reduce_scatter_v, test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_scatter_checks, test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_scatter_ops, test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_scatter_stress, test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_send_recv, test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_send_recv_complex, test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_send_recv_object_list, test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_sparse_allreduce_ops, test/distributed/test_c10d_ops_nccl.py::ProcessGroupNCCLOpTest::test_tensor_register_hook 2025-12-04T13:58:51.1746056Z 2025-12-04T13:58:51.1746239Z Finished distributed/test_c10d_ops_nccl 1/1 ... [2025-12-04 13:58:51.172720][5234772.15175705], took 0.16min 2025-12-04T13:58:51.1755175Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T13:58:51.1766009Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:58:51.1767555Z Running distributed/elastic/events/lib_test 1/1 ... [2025-12-04 13:58:51.176646][5234772.155687806] 2025-12-04T13:58:51.1767799Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:58:51.1769517Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/elastic/events/lib_test.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:58:51.176847] 2025-12-04T13:58:53.3944505Z 2025-12-04T13:58:53.3945342Z distributed/elastic/events/lib_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.elastic.events.lib_test_1.1_3edaaf21764b7565_.log 2025-12-04T13:58:53.3949352Z Running 8 items in this shard: test/distributed/elastic/events/lib_test.py::EventLibTest::test_event_created, test/distributed/elastic/events/lib_test.py::EventLibTest::test_event_deser, test/distributed/elastic/events/lib_test.py::EventLibTest::test_get_or_create_logger, test/distributed/elastic/events/lib_test.py::RdzvEventLibTest::test_construct_and_record_rdzv_event, test/distributed/elastic/events/lib_test.py::RdzvEventLibTest::test_construct_and_record_rdzv_event_does_not_run_if_invalid_dest, test/distributed/elastic/events/lib_test.py::RdzvEventLibTest::test_rdzv_event_created, test/distributed/elastic/events/lib_test.py::RdzvEventLibTest::test_rdzv_event_deserialize, test/distributed/elastic/events/lib_test.py::RdzvEventLibTest::test_rdzv_event_str 2025-12-04T13:58:53.3952375Z 2025-12-04T13:58:53.3952640Z Finished distributed/elastic/events/lib_test 1/1 ... 
[2025-12-04 13:58:53.394178][5234774.373215253], took 0.04min 2025-12-04T13:58:53.3969920Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T13:58:53.3981677Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:58:53.3982219Z Running distributed/elastic/metrics/api_test 1/1 ... [2025-12-04 13:58:53.398108][5234774.377149679] 2025-12-04T13:58:53.3982566Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:58:53.3984542Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/elastic/metrics/api_test.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:58:53.398304] 2025-12-04T13:58:55.8167445Z 2025-12-04T13:58:55.8168747Z distributed/elastic/metrics/api_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.elastic.metrics.api_test_1.1_d08f7f2dea080f69_.log 2025-12-04T13:58:55.8170891Z Running 3 items in this shard: test/distributed/elastic/metrics/api_test.py::MetricsApiTest::test_get_metric_name, test/distributed/elastic/metrics/api_test.py::MetricsApiTest::test_inheritance, test/distributed/elastic/metrics/api_test.py::MetricsApiTest::test_profile 2025-12-04T13:58:55.8172000Z 2025-12-04T13:58:55.8172637Z Finished distributed/elastic/metrics/api_test 1/1 ... [2025-12-04 13:58:55.816353][5234776.795389625], took 0.04min 2025-12-04T13:58:55.8192721Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T13:58:55.8204376Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:58:55.8204717Z Running distributed/elastic/multiprocessing/api_test 1/1 ... [2025-12-04 13:58:55.820363][5234776.79940358] 2025-12-04T13:58:55.8204973Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:58:55.8207620Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/elastic/multiprocessing/api_test.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 13:58:55.820558] 2025-12-04T13:59:16.8696373Z 2025-12-04T13:59:16.8697434Z distributed/elastic/multiprocessing/api_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.elastic.multiprocessing.api_test_1.1_293079c1c785d4d6_.log 2025-12-04T13:59:16.8707802Z Running 26 items in this shard: test/distributed/elastic/multiprocessing/api_test.py::RunProcResultsTest::test_get_failures, test/distributed/elastic/multiprocessing/api_test.py::RunProcResultsTest::test_is_failed, test/distributed/elastic/multiprocessing/api_test.py::StdTest::test_from_str_bad_input, test/distributed/elastic/multiprocessing/api_test.py::StdTest::test_from_value, test/distributed/elastic/multiprocessing/api_test.py::StdTest::test_from_value_map, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsFuncTest::test_args_env_len_mismatch, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsFuncTest::test_function_large_ret_val, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsFuncTest::test_function_raise, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsFuncTest::test_function_with_tensor, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsFuncTest::test_invalid_log_dir, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsFuncTest::test_multiprocess_context_close, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsFuncTest::test_multiprocessing_context_poll_raises_exception, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsFuncTest::test_pcontext_wait, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsFuncTest::test_pcontext_wait_on_a_child_thread, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsFuncTest::test_to_map, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsFuncTest::test_void_function, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsFuncTest::test_wait_for_all_child_procs_to_exit, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsBinaryTest::test_binary_exit, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsBinaryTest::test_binary_incorrect_entrypoint, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsBinaryTest::test_binary_raises, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsBinaryTest::test_subprocess_context_close, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesAsBinaryTest::test_validate_full_rank, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesListAsFuncTest::test_function, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesListAsBinaryTest::test_binary, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesListAsBinaryTest::test_binary_duplicate_log_filters, test/distributed/elastic/multiprocessing/api_test.py::StartProcessesListAsBinaryTest::test_binary_redirect_and_tee 2025-12-04T13:59:16.8715030Z 2025-12-04T13:59:16.8715329Z Finished distributed/elastic/multiprocessing/api_test 1/1 ... 
[2025-12-04 13:59:16.869352][5234797.848389116], took 0.35min 2025-12-04T13:59:16.8721300Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T13:59:16.8730008Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:59:16.8732686Z Running distributed/elastic/timer/local_timer_example 1/1 ... [2025-12-04 13:59:16.873140][5234797.852181664] 2025-12-04T13:59:16.8732954Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:59:16.8735047Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/elastic/timer/local_timer_example.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:59:16.873337] 2025-12-04T13:59:27.4558026Z 2025-12-04T13:59:27.4559237Z distributed/elastic/timer/local_timer_example 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.elastic.timer.local_timer_example_1.1_1a8baab1bb8d84b4_.log 2025-12-04T13:59:27.4561010Z Running 2 items in this shard: test/distributed/elastic/timer/local_timer_example.py::LocalTimerExample::test_example_start_method_spawn, test/distributed/elastic/timer/local_timer_example.py::LocalTimerExample::test_torch_mp_example 2025-12-04T13:59:27.4561871Z 2025-12-04T13:59:27.4562236Z Finished distributed/elastic/timer/local_timer_example 1/1 ... [2025-12-04 13:59:27.455524][5234808.434560796], took 0.18min 2025-12-04T13:59:27.4585901Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T13:59:27.4596016Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:59:27.4598464Z Running distributed/elastic/timer/local_timer_test 1/1 ... [2025-12-04 13:59:27.459753][5234808.438794379] 2025-12-04T13:59:27.4598844Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:59:27.4600878Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/elastic/timer/local_timer_test.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 13:59:27.459965] 2025-12-04T13:59:33.4336025Z 2025-12-04T13:59:33.4337779Z distributed/elastic/timer/local_timer_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.elastic.timer.local_timer_test_1.1_3f18f0368c8ef813_.log 2025-12-04T13:59:33.4343258Z Running 14 items in this shard: test/distributed/elastic/timer/local_timer_test.py::LocalTimerTest::test_client_interaction, test/distributed/elastic/timer/local_timer_test.py::LocalTimerTest::test_exception_propagation, test/distributed/elastic/timer/local_timer_test.py::LocalTimerTest::test_get_timer_recursive, test/distributed/elastic/timer/local_timer_test.py::LocalTimerTest::test_happy_path, test/distributed/elastic/timer/local_timer_test.py::LocalTimerTest::test_no_client, test/distributed/elastic/timer/local_timer_test.py::LocalTimerTest::test_timer, test/distributed/elastic/timer/local_timer_test.py::MultiprocessingRequestQueueTest::test_get, test/distributed/elastic/timer/local_timer_test.py::MultiprocessingRequestQueueTest::test_get_less_than_size, test/distributed/elastic/timer/local_timer_test.py::MultiprocessingRequestQueueTest::test_get_size, test/distributed/elastic/timer/local_timer_test.py::LocalTimerServerTest::test_acquire_release, test/distributed/elastic/timer/local_timer_test.py::LocalTimerServerTest::test_expired_timers, test/distributed/elastic/timer/local_timer_test.py::LocalTimerServerTest::test_valid_timers, test/distributed/elastic/timer/local_timer_test.py::LocalTimerServerTest::test_watchdog_call_count, test/distributed/elastic/timer/local_timer_test.py::LocalTimerServerTest::test_watchdog_empty_queue 2025-12-04T13:59:33.4347709Z 2025-12-04T13:59:33.4348030Z Finished distributed/elastic/timer/local_timer_test 1/1 ... [2025-12-04 13:59:33.433203][5234814.412239534], took 0.10min 2025-12-04T13:59:33.4361804Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T13:59:33.4373747Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:59:33.4374268Z Running distributed/elastic/utils/distributed_test 1/1 ... [2025-12-04 13:59:33.437320][5234814.416361747] 2025-12-04T13:59:33.4374576Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:59:33.4376639Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/elastic/utils/distributed_test.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 13:59:33.437520] 2025-12-04T13:59:39.0098885Z 2025-12-04T13:59:39.0100322Z distributed/elastic/utils/distributed_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.elastic.utils.distributed_test_1.1_01966c024defed56_.log 2025-12-04T13:59:39.0104728Z Running 8 items in this shard: test/distributed/elastic/utils/distributed_test.py::DistributedUtilTest::test_create_store_multi, test/distributed/elastic/utils/distributed_test.py::DistributedUtilTest::test_create_store_no_port_multi, test/distributed/elastic/utils/distributed_test.py::DistributedUtilTest::test_create_store_single_server, test/distributed/elastic/utils/distributed_test.py::DistributedUtilTest::test_create_store_timeout_on_server, test/distributed/elastic/utils/distributed_test.py::DistributedUtilTest::test_create_store_timeout_on_worker, test/distributed/elastic/utils/distributed_test.py::DistributedUtilTest::test_create_store_with_libuv_support, test/distributed/elastic/utils/distributed_test.py::DistributedUtilTest::test_port_already_in_use_on_server, test/distributed/elastic/utils/distributed_test.py::DistributedUtilTest::test_port_already_in_use_on_worker 2025-12-04T13:59:39.0108819Z 2025-12-04T13:59:39.0109251Z Finished distributed/elastic/utils/distributed_test 1/1 ... [2025-12-04 13:59:39.009647][5234819.988682802], took 0.09min 2025-12-04T13:59:39.0125911Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T13:59:39.0136512Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:59:39.0138685Z Running distributed/elastic/utils/logging_test 1/1 ... [2025-12-04 13:59:39.013777][5234819.992818205] 2025-12-04T13:59:39.0139001Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:59:39.0140969Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/elastic/utils/logging_test.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:59:39.013972] 2025-12-04T13:59:41.2320099Z 2025-12-04T13:59:41.2321094Z distributed/elastic/utils/logging_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.elastic.utils.logging_test_1.1_9edefec145f4d39d_.log 2025-12-04T13:59:41.2322445Z Running 2 items in this shard: test/distributed/elastic/utils/logging_test.py::LoggingTest::test_derive_module_name, test/distributed/elastic/utils/logging_test.py::LoggingTest::test_logger_name 2025-12-04T13:59:41.2323158Z 2025-12-04T13:59:41.2323477Z Finished distributed/elastic/utils/logging_test 1/1 ... [2025-12-04 13:59:41.231685][5234822.210722943], took 0.04min 2025-12-04T13:59:41.2346910Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml 2025-12-04T13:59:41.2356165Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:59:41.2358451Z Running distributed/elastic/utils/util_test 1/1 ... 
[2025-12-04 13:59:41.235753][5234822.214794317] 2025-12-04T13:59:41.2358840Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:59:41.2362847Z Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'distributed/elastic/utils/util_test.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:59:41.235945] 2025-12-04T13:59:43.6040070Z 2025-12-04T13:59:43.6040921Z distributed/elastic/utils/util_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.elastic.utils.util_test_1.1_718f6bc0a60f45bf_.log 2025-12-04T13:59:43.6045587Z Running 12 items in this shard: test/distributed/elastic/utils/util_test.py::StoreUtilTest::test_barrier, test/distributed/elastic/utils/util_test.py::StoreUtilTest::test_barrier_hash_store, test/distributed/elastic/utils/util_test.py::StoreUtilTest::test_barrier_timeout_operations, test/distributed/elastic/utils/util_test.py::StoreUtilTest::test_barrier_timeout_rank_tracing, test/distributed/elastic/utils/util_test.py::StoreUtilTest::test_get_all_rank_0, test/distributed/elastic/utils/util_test.py::StoreUtilTest::test_get_all_rank_n, test/distributed/elastic/utils/util_test.py::StoreUtilTest::test_synchronize, test/distributed/elastic/utils/util_test.py::StoreUtilTest::test_synchronize_hash_store, test/distributed/elastic/utils/util_test.py::UtilTest::test_get_logger, test/distributed/elastic/utils/util_test.py::UtilTest::test_get_logger_custom_name, test/distributed/elastic/utils/util_test.py::UtilTest::test_get_logger_different, test/distributed/elastic/utils/util_test.py::UtilTest::test_get_logger_none 2025-12-04T13:59:43.6056965Z 2025-12-04T13:59:43.6057220Z Finished distributed/elastic/utils/util_test 1/1 ... 
[2025-12-04 13:59:43.603669][5234824.582707756], took 0.04min
2025-12-04T13:59:43.6060607Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_inductor_collectives/distributed.test_inductor_collectives-5592dd52db052605.xml
2025-12-04T13:59:43.6071788Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:59:45.7609554Z Running test batch 'tests to run' cost 12328.11 seconds
2025-12-04T13:59:45.7613512Z Emitting td_test_failure_stats_v2
2025-12-04T13:59:45.7617922Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764856785_7d73d336d11911f0aa330a4c12374f04
2025-12-04T13:59:47.7791004Z /var/lib/jenkins/pytorch/tools/stats/upload_metrics.py:156: UserWarning: Error uploading metric td_test_failure_stats_v2 to DynamoDB: Unable to locate credentials
2025-12-04T13:59:47.7791848Z warn(f"Error uploading metric {metric_name} to DynamoDB: {e}")
2025-12-04T13:59:47.7792260Z Emitting td_test_failure_stats_v2
2025-12-04T13:59:47.7795720Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764856787_7ea7b196d11911f0aa330a4c12374f04
2025-12-04T13:59:47.7811746Z Emitting td_test_failure_stats_v2
2025-12-04T13:59:47.7812266Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764856787_7ea7f98ad11911f0aa330a4c12374f04
2025-12-04T13:59:47.7829318Z Emitting td_test_failure_stats_v2
2025-12-04T13:59:47.7829901Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764856787_7ea83fc6d11911f0aa330a4c12374f04
2025-12-04T13:59:47.7847202Z Emitting td_test_failure_stats_v2
2025-12-04T13:59:47.7848523Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764856787_7ea88602d11911f0aa330a4c12374f04
2025-12-04T13:59:47.7865030Z Emitting td_test_failure_stats_v2
2025-12-04T13:59:47.7865781Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764856787_7ea8c9f0d11911f0aa330a4c12374f04
2025-12-04T13:59:47.7883044Z Emitting td_test_failure_stats_v2
2025-12-04T13:59:47.7883557Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764856787_7ea91068d11911f0aa330a4c12374f04
2025-12-04T13:59:47.7900623Z Emitting td_test_failure_stats_v2
2025-12-04T13:59:47.7901026Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764856787_7ea956aed11911f0aa330a4c12374f04
2025-12-04T13:59:47.7917351Z distributed/fsdp/test_fsdp_overlap 1/1 failed!
2025-12-04T13:59:47.7917598Z distributed/fsdp/test_fsdp_exec_order 1/1 failed!
2025-12-04T13:59:47.7917813Z distributed/fsdp/test_fsdp_input 1/1 failed!
2025-12-04T13:59:47.7918017Z distributed/fsdp/test_fsdp_traversal 1/1 failed!
2025-12-04T13:59:47.7918226Z distributed/fsdp/test_fsdp_checkpoint 1/1 failed!
2025-12-04T13:59:47.7918433Z distributed/fsdp/test_fsdp_fine_tune 1/1 failed!
2025-12-04T13:59:47.7918649Z distributed/fsdp/test_hsdp_dtensor_state_dict 1/1 failed!
2025-12-04T13:59:47.7918863Z distributed/fsdp/test_fsdp_core 1/2 failed!
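The "Unable to locate credentials" messages above come from the metrics uploader failing to find AWS credentials on this runner; /var/lib/jenkins/pytorch/tools/stats/upload_metrics.py downgrades the DynamoDB error to a UserWarning so the job keeps going. A minimal sketch of that pattern is shown below, assuming boto3 is installed; the table name and payload are illustrative, not the real ones used by upload_metrics.py.

    # Hypothetical sketch (not tools/stats/upload_metrics.py itself): emit a metric
    # document to DynamoDB, but downgrade missing-credential errors to a warning so
    # the job can keep running, matching the UserWarning in the log above.
    from warnings import warn

    import boto3
    from botocore.exceptions import ClientError, NoCredentialsError

    def emit_metric(metric_name: str, document: dict) -> None:
        try:
            # Table name is illustrative only.
            table = boto3.resource("dynamodb").Table("torchci-metrics-example")
            table.put_item(Item={"metric_name": metric_name, **document})
        except (NoCredentialsError, ClientError) as e:
            warn(f"Error uploading metric {metric_name} to DynamoDB: {e}")

    # emit_metric("td_test_failure_stats_v2", {"workflow_id": "19922849170"})

With no credentials configured, put_item raises NoCredentialsError and the call degrades to exactly the kind of warning recorded above while the S3 document writes proceed separately.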
2025-12-04T13:59:48.5409329Z
2025-12-04T13:59:48.5409812Z real 205m33.994s
2025-12-04T13:59:48.5410140Z user 1036m3.585s
2025-12-04T13:59:48.5410384Z sys 423m46.480s
2025-12-04T13:59:48.5410635Z + sccache_epilogue
2025-12-04T13:59:48.5410961Z + echo '::group::Sccache Compilation Log'
2025-12-04T13:59:48.5411712Z ##[group]Sccache Compilation Log
2025-12-04T13:59:48.5412111Z + echo '=================== sccache compilation log ==================='
2025-12-04T13:59:48.5413017Z =================== sccache compilation log ===================
2025-12-04T13:59:48.5413649Z + python /var/lib/jenkins/pytorch/.ci/pytorch/print_sccache_log.py /var/lib/jenkins/sccache_error.log
2025-12-04T13:59:48.5487949Z + echo '=========== If your build fails, please take a look at the log above for possible reasons ==========='
2025-12-04T13:59:48.5488444Z =========== If your build fails, please take a look at the log above for possible reasons ===========
2025-12-04T13:59:48.5488790Z + sccache --show-stats
2025-12-04T13:59:48.5512685Z Compile requests 339
2025-12-04T13:59:48.5512943Z Compile requests executed 0
2025-12-04T13:59:48.5513155Z Cache hits 0
2025-12-04T13:59:48.5513354Z Cache misses 0
2025-12-04T13:59:48.5513558Z Cache hits rate -
2025-12-04T13:59:48.5513764Z Cache timeouts 0
2025-12-04T13:59:48.5513967Z Cache read errors 0
2025-12-04T13:59:48.5514165Z Forced recaches 0
2025-12-04T13:59:48.5514480Z Cache write errors 0
2025-12-04T13:59:48.5514679Z Cache errors 0
2025-12-04T13:59:48.5514879Z Compilations 0
2025-12-04T13:59:48.5515087Z Compilation failures 0
2025-12-04T13:59:48.5515306Z Non-cacheable compilations 0
2025-12-04T13:59:48.5515512Z Non-cacheable calls 0
2025-12-04T13:59:48.5515719Z Non-compilation calls 339
2025-12-04T13:59:48.5515937Z Unsupported compiler calls 0
2025-12-04T13:59:48.5516156Z Average cache write 0.000 s
2025-12-04T13:59:48.5516375Z Average compiler 0.000 s
2025-12-04T13:59:48.5516593Z Average cache read hit 0.000 s
2025-12-04T13:59:48.5516811Z Failed distributed compilations 0
2025-12-04T13:59:48.5517082Z Cache location Local disk: "/var/lib/jenkins/.cache/sccache"
2025-12-04T13:59:48.5517373Z Use direct/preprocessor mode? yes
2025-12-04T13:59:48.5517592Z Version (client) 0.10.0
2025-12-04T13:59:48.5517805Z Max cache size 10 GiB
2025-12-04T13:59:48.5518032Z + sccache --stop-server
2025-12-04T13:59:48.5536247Z Stopping sccache server...
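The stats block above is sccache's plain two-column "name value" output, and nothing in this job parses it. If you did want to inspect it programmatically, a small sketch like the following would work; it is a hypothetical helper only, assuming sccache is on PATH and pads the two columns with runs of spaces.

    # Hypothetical helper: capture the same `sccache --show-stats` output shown above
    # and split each row into a (name, value) pair on runs of two or more spaces.
    import re
    import subprocess

    def sccache_stats() -> dict[str, str]:
        out = subprocess.run(
            ["sccache", "--show-stats"], capture_output=True, text=True, check=True
        ).stdout
        stats = {}
        for line in out.splitlines():
            parts = re.split(r"\s{2,}", line.strip(), maxsplit=1)
            if len(parts) == 2:
                stats[parts[0]] = parts[1]
        return stats

    # A fully cold run like this one would report stats["Cache hits"] == "0"
    # alongside 339 compile requests that were all non-compilation calls.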
2025-12-04T13:59:48.5539316Z Compile requests 339
2025-12-04T13:59:48.5539627Z Compile requests executed 0
2025-12-04T13:59:48.5539809Z Cache hits 0
2025-12-04T13:59:48.5539989Z Cache misses 0
2025-12-04T13:59:48.5540253Z Cache hits rate -
2025-12-04T13:59:48.5540421Z Cache timeouts 0
2025-12-04T13:59:48.5540580Z Cache read errors 0
2025-12-04T13:59:48.5540744Z Forced recaches 0
2025-12-04T13:59:48.5540908Z Cache write errors 0
2025-12-04T13:59:48.5541074Z Cache errors 0
2025-12-04T13:59:48.5541242Z Compilations 0
2025-12-04T13:59:48.5541410Z Compilation failures 0
2025-12-04T13:59:48.5541590Z Non-cacheable compilations 0
2025-12-04T13:59:48.5541762Z Non-cacheable calls 0
2025-12-04T13:59:48.5541928Z Non-compilation calls 339
2025-12-04T13:59:48.5542107Z Unsupported compiler calls 0
2025-12-04T13:59:48.5542283Z Average cache write 0.000 s
2025-12-04T13:59:48.5542472Z Average compiler 0.000 s
2025-12-04T13:59:48.5542648Z Average cache read hit 0.000 s
2025-12-04T13:59:48.5542835Z Failed distributed compilations 0
2025-12-04T13:59:48.5543058Z Cache location Local disk: "/var/lib/jenkins/.cache/sccache"
2025-12-04T13:59:48.5543297Z Use direct/preprocessor mode? yes
2025-12-04T13:59:48.5543471Z Version (client) 0.10.0
2025-12-04T13:59:48.5543652Z Max cache size 10 GiB
2025-12-04T13:59:48.5543864Z + echo ::endgroup::
2025-12-04T13:59:48.5544118Z ##[endgroup]
2025-12-04T13:59:48.5614280Z ##[error]Process completed with exit code 1.
2025-12-04T13:59:48.5640708Z ##[group]Run # copy test results back to the mounted workspace, needed sudo, resulting permissions were correct
2025-12-04T13:59:48.5641019Z # copy test results back to the mounted workspace, needed sudo, resulting permissions were correct
2025-12-04T13:59:48.5641390Z docker exec -t "8ac2e1cca5c5a27e1632f345b696937e28036af305fa8c40ccee0a9f11fed68c" sh -c "cd ../pytorch && sudo cp -R test/test-reports ../workspace/test"
2025-12-04T13:59:48.5645562Z shell: /usr/bin/bash -e {0}
2025-12-04T13:59:48.5645675Z env:
2025-12-04T13:59:48.5645768Z GIT_DEFAULT_BRANCH: main
2025-12-04T13:59:48.5645903Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts
2025-12-04T13:59:48.5646081Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results
2025-12-04T13:59:48.5646245Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs
2025-12-04T13:59:48.5646767Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host
2025-12-04T13:59:48.5647255Z AWS_DEFAULT_REGION: us-east-1
2025-12-04T13:59:48.5647370Z AWS_REGION: us-east-1
2025-12-04T13:59:48.5647524Z AWS_ACCESS_KEY_ID: ***
2025-12-04T13:59:48.5647676Z AWS_SECRET_ACCESS_KEY: ***
2025-12-04T13:59:48.5649664Z AWS_SESSION_TOKEN: ***
2025-12-04T13:59:48.5649866Z CONTAINER_NAME: 8ac2e1cca5c5a27e1632f345b696937e28036af305fa8c40ccee0a9f11fed68c
2025-12-04T13:59:48.5650060Z ##[endgroup]
2025-12-04T13:59:48.6362141Z ##[group]Run docker exec -t "8ac2e1cca5c5a27e1632f345b696937e28036af305fa8c40ccee0a9f11fed68c" sh -c "sudo chown -R 1001:1001 test"
2025-12-04T13:59:48.6362531Z docker exec -t "8ac2e1cca5c5a27e1632f345b696937e28036af305fa8c40ccee0a9f11fed68c" sh -c "sudo chown -R 1001:1001 test"
2025-12-04T13:59:48.6366811Z shell: /usr/bin/bash -e {0}
2025-12-04T13:59:48.6366922Z env:
2025-12-04T13:59:48.6367016Z GIT_DEFAULT_BRANCH: main
2025-12-04T13:59:48.6367151Z
RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:59:48.6367323Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:59:48.6367486Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:59:48.6367998Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:59:48.6368557Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:59:48.6368671Z AWS_REGION: us-east-1 2025-12-04T13:59:48.6368829Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:59:48.6368984Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:59:48.6371025Z AWS_SESSION_TOKEN: *** 2025-12-04T13:59:48.6371192Z CONTAINER_NAME: 8ac2e1cca5c5a27e1632f345b696937e28036af305fa8c40ccee0a9f11fed68c 2025-12-04T13:59:48.6371371Z ##[endgroup] 2025-12-04T13:59:48.7167391Z ##[group]Run cat test/**/*_toprint.log || true 2025-12-04T13:59:48.7167552Z cat test/**/*_toprint.log || true 2025-12-04T13:59:48.7171670Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T13:59:48.7171816Z env: 2025-12-04T13:59:48.7171910Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:59:48.7172051Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:59:48.7172224Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:59:48.7172389Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:59:48.7172893Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:59:48.7173433Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:59:48.7173559Z AWS_REGION: us-east-1 2025-12-04T13:59:48.7173715Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:59:48.7173864Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:59:48.7175830Z AWS_SESSION_TOKEN: *** 2025-12-04T13:59:48.7175995Z CONTAINER_NAME: 8ac2e1cca5c5a27e1632f345b696937e28036af305fa8c40ccee0a9f11fed68c 2025-12-04T13:59:48.7176175Z ##[endgroup] 2025-12-04T13:59:48.7220921Z cat: 'test/**/*_toprint.log': No such file or directory 2025-12-04T13:59:48.7283043Z Prepare all required actions 2025-12-04T13:59:48.7283434Z Getting action download info 2025-12-04T13:59:49.0496881Z Download action repository 'seemethere/upload-artifact-s3@v5' (SHA:baba72d0712b404f646cebe0730933554ebce96a) 2025-12-04T13:59:49.9118064Z Download action repository 'actions/upload-artifact@v4' (SHA:ea165f8d65b6e75b540449e92b4886f43607fa02) 2025-12-04T13:59:50.8288005Z ##[group]Run ./.github/actions/upload-test-artifacts 2025-12-04T13:59:50.8288159Z with: 2025-12-04T13:59:50.8288249Z use-gha: true 2025-12-04T13:59:50.8288415Z file-suffix: test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57116213187 2025-12-04T13:59:50.8288602Z s3-bucket: gha-artifacts 2025-12-04T13:59:50.8288711Z env: 2025-12-04T13:59:50.8288803Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:59:50.8288937Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:59:50.8289116Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:59:50.8289319Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:59:50.8289888Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device 
/dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:59:50.8290389Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:59:50.8290511Z AWS_REGION: us-east-1 2025-12-04T13:59:50.8290679Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:59:50.8290834Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:59:50.8292807Z AWS_SESSION_TOKEN: *** 2025-12-04T13:59:50.8292982Z CONTAINER_NAME: 8ac2e1cca5c5a27e1632f345b696937e28036af305fa8c40ccee0a9f11fed68c 2025-12-04T13:59:50.8293165Z ##[endgroup] 2025-12-04T13:59:50.8325831Z ##[group]Run actions/upload-artifact@v4 2025-12-04T13:59:50.8325971Z with: 2025-12-04T13:59:50.8326271Z name: test-jsons-runattempt1-test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57116213187.zip 2025-12-04T13:59:50.8326486Z retention-days: 14 2025-12-04T13:59:50.8326593Z if-no-files-found: warn 2025-12-04T13:59:50.8326701Z path: test/**/*.json 2025-12-04T13:59:50.8326804Z compression-level: 6 2025-12-04T13:59:50.8326905Z overwrite: false 2025-12-04T13:59:50.8327007Z include-hidden-files: false 2025-12-04T13:59:50.8327117Z env: 2025-12-04T13:59:50.8327208Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:59:50.8327341Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:59:50.8327514Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:59:50.8327676Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:59:50.8328185Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:59:50.8328674Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:59:50.8328793Z AWS_REGION: us-east-1 2025-12-04T13:59:50.8328944Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:59:50.8329098Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:59:50.8331142Z AWS_SESSION_TOKEN: *** 2025-12-04T13:59:50.8331314Z CONTAINER_NAME: 8ac2e1cca5c5a27e1632f345b696937e28036af305fa8c40ccee0a9f11fed68c 2025-12-04T13:59:50.8331564Z ##[endgroup] 2025-12-04T13:59:51.2088861Z With the provided path, there will be 6 files uploaded 2025-12-04T13:59:51.2092246Z Artifact name is valid! 2025-12-04T13:59:51.2093290Z Root directory input is valid! 2025-12-04T13:59:51.4270538Z Beginning upload of artifact content to blob storage 2025-12-04T13:59:51.8176285Z Uploaded bytes 44615 2025-12-04T13:59:51.8897791Z Finished uploading artifact content to blob storage! 2025-12-04T13:59:51.8898965Z SHA256 digest of uploaded artifact zip is 029366cfb8163f844ae937e8b0a3b01a795d0e959338e1e1fb83bf5154f763ec 2025-12-04T13:59:51.8900179Z Finalizing artifact upload 2025-12-04T13:59:52.1027253Z Artifact test-jsons-runattempt1-test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57116213187.zip.zip successfully finalized. Artifact ID 4764837803 2025-12-04T13:59:52.1028677Z Artifact test-jsons-runattempt1-test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57116213187.zip has been successfully uploaded! Final size is 44615 bytes. 
Artifact ID is 4764837803 2025-12-04T13:59:52.1033096Z Artifact download URL: https://github.com/pytorch/pytorch/actions/runs/19922849170/artifacts/4764837803 2025-12-04T13:59:52.1165184Z ##[group]Run actions/upload-artifact@v4 2025-12-04T13:59:52.1165326Z with: 2025-12-04T13:59:52.1165533Z name: test-reports-runattempt1-test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57116213187.zip 2025-12-04T13:59:52.1165765Z retention-days: 14 2025-12-04T13:59:52.1165877Z if-no-files-found: ignore 2025-12-04T13:59:52.1166002Z path: test/**/*.xml test/**/*.csv 2025-12-04T13:59:52.1166139Z compression-level: 6 2025-12-04T13:59:52.1166246Z overwrite: false 2025-12-04T13:59:52.1166351Z include-hidden-files: false 2025-12-04T13:59:52.1166468Z env: 2025-12-04T13:59:52.1166560Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:59:52.1166705Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:59:52.1166889Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:59:52.1167069Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:59:52.1167584Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:59:52.1168086Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:59:52.1168207Z AWS_REGION: us-east-1 2025-12-04T13:59:52.1168367Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:59:52.1168602Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:59:52.1170664Z AWS_SESSION_TOKEN: *** 2025-12-04T13:59:52.1170841Z CONTAINER_NAME: 8ac2e1cca5c5a27e1632f345b696937e28036af305fa8c40ccee0a9f11fed68c 2025-12-04T13:59:52.1171027Z ##[endgroup] 2025-12-04T13:59:52.5433970Z With the provided path, there will be 859 files uploaded 2025-12-04T13:59:52.5436665Z Artifact name is valid! 2025-12-04T13:59:52.5437405Z Root directory input is valid! 2025-12-04T13:59:52.7705680Z Beginning upload of artifact content to blob storage 2025-12-04T13:59:53.4919768Z Uploaded bytes 712772 2025-12-04T13:59:53.5599758Z Finished uploading artifact content to blob storage! 2025-12-04T13:59:53.5601027Z SHA256 digest of uploaded artifact zip is f5d9a4191bc68805afcc704a5c344589ccdcd6c6d18f2c02d81b5874280de6f7 2025-12-04T13:59:53.5601670Z Finalizing artifact upload 2025-12-04T13:59:54.0288743Z Artifact test-reports-runattempt1-test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57116213187.zip.zip successfully finalized. Artifact ID 4764838140 2025-12-04T13:59:54.0290321Z Artifact test-reports-runattempt1-test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57116213187.zip has been successfully uploaded! Final size is 712772 bytes. 
Artifact ID is 4764838140 2025-12-04T13:59:54.0295959Z Artifact download URL: https://github.com/pytorch/pytorch/actions/runs/19922849170/artifacts/4764838140 2025-12-04T13:59:54.0439861Z ##[group]Run actions/upload-artifact@v4 2025-12-04T13:59:54.0440130Z with: 2025-12-04T13:59:54.0440368Z name: logs-runattempt1-test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57116213187.zip 2025-12-04T13:59:54.0440638Z retention-days: 14 2025-12-04T13:59:54.0440783Z if-no-files-found: ignore 2025-12-04T13:59:54.0440940Z path: usage_log.txt test/**/*.log 2025-12-04T13:59:54.0441107Z compression-level: 6 2025-12-04T13:59:54.0441243Z overwrite: false 2025-12-04T13:59:54.0441384Z include-hidden-files: false 2025-12-04T13:59:54.0441530Z env: 2025-12-04T13:59:54.0441648Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:59:54.0441835Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:59:54.0449504Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:59:54.0449961Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:59:54.0450499Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:59:54.0451089Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:59:54.0451213Z AWS_REGION: us-east-1 2025-12-04T13:59:54.0451382Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:59:54.0451552Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:59:54.0453593Z AWS_SESSION_TOKEN: *** 2025-12-04T13:59:54.0453775Z CONTAINER_NAME: 8ac2e1cca5c5a27e1632f345b696937e28036af305fa8c40ccee0a9f11fed68c 2025-12-04T13:59:54.0453965Z ##[endgroup] 2025-12-04T13:59:54.4438573Z Multiple search paths detected. Calculating the least common ancestor of all paths 2025-12-04T13:59:54.4439512Z The least common ancestor is /home/runner/_work/pytorch/pytorch. This will be the root directory of the artifact 2025-12-04T13:59:54.4440092Z With the provided path, there will be 84 files uploaded 2025-12-04T13:59:54.4443002Z Artifact name is valid! 2025-12-04T13:59:54.4443567Z Root directory input is valid! 2025-12-04T13:59:54.6704991Z Beginning upload of artifact content to blob storage 2025-12-04T13:59:55.3022576Z Uploaded bytes 800413 2025-12-04T13:59:55.3685064Z Finished uploading artifact content to blob storage! 2025-12-04T13:59:55.3686259Z SHA256 digest of uploaded artifact zip is b50b3de5526cf4546d5fc4e5d0d4023cc70603fbf368d13ce20e8b9aa165453d 2025-12-04T13:59:55.3687061Z Finalizing artifact upload 2025-12-04T13:59:55.5133067Z Artifact logs-runattempt1-test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57116213187.zip.zip successfully finalized. Artifact ID 4764838526 2025-12-04T13:59:55.5134123Z Artifact logs-runattempt1-test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57116213187.zip has been successfully uploaded! Final size is 800413 bytes. Artifact ID is 4764838526 2025-12-04T13:59:55.5138549Z Artifact download URL: https://github.com/pytorch/pytorch/actions/runs/19922849170/artifacts/4764838526 2025-12-04T13:59:55.5264083Z ##[group]Run # shellcheck disable=SC2156 2025-12-04T13:59:55.5264313Z # shellcheck disable=SC2156 2025-12-04T13:59:55.5264600Z find . 
-iname "core.[1-9]*" -exec docker exec "${CONTAINER_NAME}" sh -c "gdb python {} -ex 'bt' -ex 'q'" \; 2025-12-04T13:59:55.5269461Z shell: /usr/bin/bash -e {0} 2025-12-04T13:59:55.5269820Z env: 2025-12-04T13:59:55.5269930Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:59:55.5270083Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:59:55.5270273Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:59:55.5270451Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:59:55.5271013Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:59:55.5271517Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:59:55.5271714Z AWS_REGION: us-east-1 2025-12-04T13:59:55.5271886Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:59:55.5272057Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:59:55.5274059Z AWS_SESSION_TOKEN: *** 2025-12-04T13:59:55.5274243Z CONTAINER_NAME: 8ac2e1cca5c5a27e1632f345b696937e28036af305fa8c40ccee0a9f11fed68c 2025-12-04T13:59:55.5274436Z ##[endgroup] 2025-12-04T13:59:55.6561805Z ##[group]Run actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 2025-12-04T13:59:55.6562031Z with: 2025-12-04T13:59:55.6562188Z name: coredumps-distributed-2-3-linux.rocm.gpu.gfx942.4.b 2025-12-04T13:59:55.6562386Z retention-days: 14 2025-12-04T13:59:55.6562509Z if-no-files-found: ignore 2025-12-04T13:59:55.6562638Z path: ./**/core.[1-9]* 2025-12-04T13:59:55.6562764Z compression-level: 6 2025-12-04T13:59:55.6562887Z overwrite: false 2025-12-04T13:59:55.6563010Z include-hidden-files: false 2025-12-04T13:59:55.6563142Z env: 2025-12-04T13:59:55.6563245Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:59:55.6563413Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:59:55.6563616Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:59:55.6563806Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:59:55.6564405Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD128 --device /dev/dri/renderD136 --device /dev/dri/renderD144 --device /dev/dri/renderD152 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:59:55.6564919Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:59:55.6565045Z AWS_REGION: us-east-1 2025-12-04T13:59:55.6565237Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:59:55.6565400Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:59:55.6567387Z AWS_SESSION_TOKEN: *** 2025-12-04T13:59:55.6567570Z CONTAINER_NAME: 8ac2e1cca5c5a27e1632f345b696937e28036af305fa8c40ccee0a9f11fed68c 2025-12-04T13:59:55.6567761Z ##[endgroup] 2025-12-04T13:59:59.5985251Z No files were found with the provided path: ./**/core.[1-9]*. No artifacts will be uploaded. 2025-12-04T13:59:59.6150129Z Post job cleanup. 2025-12-04T13:59:59.6162629Z Post job cleanup. 2025-12-04T13:59:59.6365267Z Logging out of registry 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T13:59:59.6578813Z Post job cleanup. 2025-12-04T13:59:59.7209075Z Post job cleanup. 2025-12-04T13:59:59.7228715Z Post job cleanup. 
2025-12-04T13:59:59.7696723Z [command]/usr/bin/git version 2025-12-04T13:59:59.7721749Z git version 2.52.0 2025-12-04T13:59:59.7741400Z Copying '/home/runner/.gitconfig' to '/home/runner/_work/_temp/8fbd6a03-a64a-4207-be18-c4cca24ee4fc/.gitconfig' 2025-12-04T13:59:59.7746855Z Temporarily overriding HOME='/home/runner/_work/_temp/8fbd6a03-a64a-4207-be18-c4cca24ee4fc' before making global git config changes 2025-12-04T13:59:59.7747382Z Adding repository directory to the temporary git global config as a safe directory 2025-12-04T13:59:59.7749467Z [command]/usr/bin/git config --global --add safe.directory /home/runner/_work/pytorch/pytorch 2025-12-04T13:59:59.7775932Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-12-04T13:59:59.7809766Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-12-04T13:59:59.8005598Z Entering 'android/libs/fbjni' 2025-12-04T13:59:59.8031238Z Entering 'third_party/FP16' 2025-12-04T13:59:59.8056150Z Entering 'third_party/FXdiv' 2025-12-04T13:59:59.8079889Z Entering 'third_party/NNPACK' 2025-12-04T13:59:59.8106054Z Entering 'third_party/NVTX' 2025-12-04T13:59:59.8132378Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T13:59:59.8157656Z Entering 'third_party/XNNPACK' 2025-12-04T13:59:59.8193215Z Entering 'third_party/aiter' 2025-12-04T13:59:59.8227428Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T13:59:59.8260625Z Entering 'third_party/benchmark' 2025-12-04T13:59:59.8285979Z Entering 'third_party/composable_kernel' 2025-12-04T13:59:59.8315482Z Entering 'third_party/cpp-httplib' 2025-12-04T13:59:59.8339127Z Entering 'third_party/cpuinfo' 2025-12-04T13:59:59.8363053Z Entering 'third_party/cudnn_frontend' 2025-12-04T13:59:59.8386750Z Entering 'third_party/cutlass' 2025-12-04T13:59:59.8414976Z Entering 'third_party/fbgemm' 2025-12-04T13:59:59.8440668Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T13:59:59.8465175Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T13:59:59.8489840Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T13:59:59.8515039Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T13:59:59.8541667Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T13:59:59.8564353Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T13:59:59.8590582Z Entering 'third_party/fbgemm/external/json' 2025-12-04T13:59:59.8615532Z Entering 'third_party/flash-attention' 2025-12-04T13:59:59.8641024Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T13:59:59.8666718Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T13:59:59.8694611Z Entering 'third_party/flatbuffers' 2025-12-04T13:59:59.8718235Z Entering 'third_party/fmt' 2025-12-04T13:59:59.8749848Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T13:59:59.8774562Z Entering 'third_party/gloo' 2025-12-04T13:59:59.8798083Z Entering 'third_party/googletest' 2025-12-04T13:59:59.8821021Z Entering 'third_party/ideep' 2025-12-04T13:59:59.8848939Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T13:59:59.8886210Z Entering 'third_party/ittapi' 2025-12-04T13:59:59.8908576Z Entering 'third_party/kineto' 2025-12-04T13:59:59.8940942Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T13:59:59.8964546Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T13:59:59.8989125Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T13:59:59.9013652Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T13:59:59.9043544Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T13:59:59.9066955Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T13:59:59.9095825Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T13:59:59.9121779Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T13:59:59.9154117Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T13:59:59.9180367Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T13:59:59.9209056Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T13:59:59.9237564Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:59:59.9263473Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:59:59.9291172Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T13:59:59.9315686Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T13:59:59.9341892Z Entering 'third_party/kleidiai' 2025-12-04T13:59:59.9369334Z Entering 'third_party/mimalloc' 2025-12-04T13:59:59.9393346Z Entering 'third_party/nlohmann' 2025-12-04T13:59:59.9416671Z Entering 'third_party/onnx' 2025-12-04T13:59:59.9447395Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T13:59:59.9478034Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T13:59:59.9503670Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T13:59:59.9549832Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T13:59:59.9586420Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T13:59:59.9623588Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T13:59:59.9656375Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T13:59:59.9682132Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T13:59:59.9703412Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T13:59:59.9724363Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:59:59.9753397Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:59:59.9787804Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T13:59:59.9825814Z Entering 'third_party/pocketfft' 2025-12-04T13:59:59.9853858Z Entering 'third_party/protobuf' 2025-12-04T13:59:59.9883390Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T13:59:59.9904720Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T13:59:59.9930901Z Entering 'third_party/psimd' 2025-12-04T13:59:59.9960621Z Entering 'third_party/pthreadpool' 2025-12-04T13:59:59.9994579Z Entering 'third_party/pybind11' 2025-12-04T14:00:00.0020837Z Entering 'third_party/python-peachpy' 2025-12-04T14:00:00.0049882Z Entering 'third_party/sleef' 2025-12-04T14:00:00.0074861Z Entering 'third_party/tensorpipe' 2025-12-04T14:00:00.0101938Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T14:00:00.0130446Z Entering 
'third_party/tensorpipe/third_party/libnop' 2025-12-04T14:00:00.0159179Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T14:00:00.0183550Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T14:00:00.0208661Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T14:00:00.0254423Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-12-04T14:00:00.0270573Z http.https://github.com/.extraheader 2025-12-04T14:00:00.0280317Z [command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader 2025-12-04T14:00:00.0306815Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-12-04T14:00:00.0518885Z Entering 'android/libs/fbjni' 2025-12-04T14:00:00.0547773Z http.https://github.com/.extraheader 2025-12-04T14:00:00.0572638Z Entering 'third_party/FP16' 2025-12-04T14:00:00.0587012Z http.https://github.com/.extraheader 2025-12-04T14:00:00.0608465Z Entering 'third_party/FXdiv' 2025-12-04T14:00:00.0625486Z http.https://github.com/.extraheader 2025-12-04T14:00:00.0644867Z Entering 'third_party/NNPACK' 2025-12-04T14:00:00.0663958Z http.https://github.com/.extraheader 2025-12-04T14:00:00.0683448Z Entering 'third_party/NVTX' 2025-12-04T14:00:00.0696515Z http.https://github.com/.extraheader 2025-12-04T14:00:00.0716717Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T14:00:00.0730075Z http.https://github.com/.extraheader 2025-12-04T14:00:00.0749191Z Entering 'third_party/XNNPACK' 2025-12-04T14:00:00.0765614Z http.https://github.com/.extraheader 2025-12-04T14:00:00.0789370Z Entering 'third_party/aiter' 2025-12-04T14:00:00.0803857Z http.https://github.com/.extraheader 2025-12-04T14:00:00.0821354Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T14:00:00.0839085Z http.https://github.com/.extraheader 2025-12-04T14:00:00.0861724Z Entering 'third_party/benchmark' 2025-12-04T14:00:00.0876148Z http.https://github.com/.extraheader 2025-12-04T14:00:00.0896323Z Entering 'third_party/composable_kernel' 2025-12-04T14:00:00.0908678Z http.https://github.com/.extraheader 2025-12-04T14:00:00.0931883Z Entering 'third_party/cpp-httplib' 2025-12-04T14:00:00.0945987Z http.https://github.com/.extraheader 2025-12-04T14:00:00.0963726Z Entering 'third_party/cpuinfo' 2025-12-04T14:00:00.0983445Z http.https://github.com/.extraheader 2025-12-04T14:00:00.1004154Z Entering 'third_party/cudnn_frontend' 2025-12-04T14:00:00.1018310Z http.https://github.com/.extraheader 2025-12-04T14:00:00.1037031Z Entering 'third_party/cutlass' 2025-12-04T14:00:00.1052160Z http.https://github.com/.extraheader 2025-12-04T14:00:00.1075550Z Entering 'third_party/fbgemm' 2025-12-04T14:00:00.1090244Z http.https://github.com/.extraheader 2025-12-04T14:00:00.1114302Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T14:00:00.1129366Z http.https://github.com/.extraheader 2025-12-04T14:00:00.1158417Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T14:00:00.1176459Z http.https://github.com/.extraheader 2025-12-04T14:00:00.1196476Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T14:00:00.1209486Z http.https://github.com/.extraheader 2025-12-04T14:00:00.1226149Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T14:00:00.1245728Z http.https://github.com/.extraheader 2025-12-04T14:00:00.1267491Z 
Entering 'third_party/fbgemm/external/googletest' 2025-12-04T14:00:00.1279642Z http.https://github.com/.extraheader 2025-12-04T14:00:00.1296329Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T14:00:00.1308530Z http.https://github.com/.extraheader 2025-12-04T14:00:00.1326519Z Entering 'third_party/fbgemm/external/json' 2025-12-04T14:00:00.1340544Z http.https://github.com/.extraheader 2025-12-04T14:00:00.1360325Z Entering 'third_party/flash-attention' 2025-12-04T14:00:00.1373289Z http.https://github.com/.extraheader 2025-12-04T14:00:00.1394922Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T14:00:00.1412321Z http.https://github.com/.extraheader 2025-12-04T14:00:00.1431469Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T14:00:00.1446932Z http.https://github.com/.extraheader 2025-12-04T14:00:00.1471540Z Entering 'third_party/flatbuffers' 2025-12-04T14:00:00.1485468Z http.https://github.com/.extraheader 2025-12-04T14:00:00.1515010Z Entering 'third_party/fmt' 2025-12-04T14:00:00.1532655Z http.https://github.com/.extraheader 2025-12-04T14:00:00.1561487Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T14:00:00.1587525Z http.https://github.com/.extraheader 2025-12-04T14:00:00.1609398Z Entering 'third_party/gloo' 2025-12-04T14:00:00.1624277Z http.https://github.com/.extraheader 2025-12-04T14:00:00.1644003Z Entering 'third_party/googletest' 2025-12-04T14:00:00.1659561Z http.https://github.com/.extraheader 2025-12-04T14:00:00.1675538Z Entering 'third_party/ideep' 2025-12-04T14:00:00.1690420Z http.https://github.com/.extraheader 2025-12-04T14:00:00.1705753Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T14:00:00.1718449Z http.https://github.com/.extraheader 2025-12-04T14:00:00.1744378Z Entering 'third_party/ittapi' 2025-12-04T14:00:00.1757712Z http.https://github.com/.extraheader 2025-12-04T14:00:00.1780069Z Entering 'third_party/kineto' 2025-12-04T14:00:00.1793097Z http.https://github.com/.extraheader 2025-12-04T14:00:00.1815450Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T14:00:00.1830747Z http.https://github.com/.extraheader 2025-12-04T14:00:00.1850108Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T14:00:00.1863751Z http.https://github.com/.extraheader 2025-12-04T14:00:00.1883304Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T14:00:00.1896802Z http.https://github.com/.extraheader 2025-12-04T14:00:00.1914785Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T14:00:00.1940535Z http.https://github.com/.extraheader 2025-12-04T14:00:00.1959987Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T14:00:00.1983836Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2001682Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T14:00:00.2015589Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2037388Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T14:00:00.2050914Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2068217Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T14:00:00.2083378Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2103931Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T14:00:00.2117501Z http.https://github.com/.extraheader 
2025-12-04T14:00:00.2135132Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T14:00:00.2151433Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2168928Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T14:00:00.2182544Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2198749Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T14:00:00.2216542Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2235884Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T14:00:00.2248893Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2271662Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T14:00:00.2285105Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2302169Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T14:00:00.2322157Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2339778Z Entering 'third_party/kleidiai' 2025-12-04T14:00:00.2354402Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2373092Z Entering 'third_party/mimalloc' 2025-12-04T14:00:00.2386934Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2408689Z Entering 'third_party/nlohmann' 2025-12-04T14:00:00.2422455Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2441538Z Entering 'third_party/onnx' 2025-12-04T14:00:00.2454291Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2475127Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T14:00:00.2490791Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2513363Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T14:00:00.2529984Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2547265Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T14:00:00.2572730Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2589833Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T14:00:00.2602681Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2621125Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T14:00:00.2633747Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2652633Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T14:00:00.2665774Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2682356Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T14:00:00.2696102Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2712216Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T14:00:00.2724195Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2741361Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T14:00:00.2752694Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2768609Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T14:00:00.2786126Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2805046Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T14:00:00.2819446Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2837827Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T14:00:00.2850964Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2879292Z Entering 'third_party/pocketfft' 
2025-12-04T14:00:00.2894514Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2913787Z Entering 'third_party/protobuf' 2025-12-04T14:00:00.2927376Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2948220Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T14:00:00.2962806Z http.https://github.com/.extraheader 2025-12-04T14:00:00.2981097Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T14:00:00.2995607Z http.https://github.com/.extraheader 2025-12-04T14:00:00.3014311Z Entering 'third_party/psimd' 2025-12-04T14:00:00.3028363Z http.https://github.com/.extraheader 2025-12-04T14:00:00.3052330Z Entering 'third_party/pthreadpool' 2025-12-04T14:00:00.3064907Z http.https://github.com/.extraheader 2025-12-04T14:00:00.3081908Z Entering 'third_party/pybind11' 2025-12-04T14:00:00.3097193Z http.https://github.com/.extraheader 2025-12-04T14:00:00.3113899Z Entering 'third_party/python-peachpy' 2025-12-04T14:00:00.3127686Z http.https://github.com/.extraheader 2025-12-04T14:00:00.3144236Z Entering 'third_party/sleef' 2025-12-04T14:00:00.3157742Z http.https://github.com/.extraheader 2025-12-04T14:00:00.3182803Z Entering 'third_party/tensorpipe' 2025-12-04T14:00:00.3196886Z http.https://github.com/.extraheader 2025-12-04T14:00:00.3215539Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T14:00:00.3228969Z http.https://github.com/.extraheader 2025-12-04T14:00:00.3246033Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T14:00:00.3260444Z http.https://github.com/.extraheader 2025-12-04T14:00:00.3279446Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T14:00:00.3291601Z http.https://github.com/.extraheader 2025-12-04T14:00:00.3312498Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T14:00:00.3323777Z http.https://github.com/.extraheader 2025-12-04T14:00:00.3343167Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T14:00:00.3356912Z http.https://github.com/.extraheader 2025-12-04T14:00:00.3396001Z [command]/usr/bin/git config --local --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.3419998Z [command]/usr/bin/git submodule foreach --recursive git config --local --show-origin --name-only --get-regexp remote.origin.url 2025-12-04T14:00:00.3587618Z Entering 'android/libs/fbjni' 2025-12-04T14:00:00.3601459Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T14:00:00.3610753Z Entering 'third_party/FP16' 2025-12-04T14:00:00.3621716Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T14:00:00.3633161Z Entering 'third_party/FXdiv' 2025-12-04T14:00:00.3643440Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T14:00:00.3655378Z Entering 'third_party/NNPACK' 2025-12-04T14:00:00.3666588Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T14:00:00.3676853Z Entering 'third_party/NVTX' 2025-12-04T14:00:00.3688555Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T14:00:00.3698492Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T14:00:00.3710486Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T14:00:00.3722083Z Entering 'third_party/XNNPACK' 2025-12-04T14:00:00.3734145Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T14:00:00.3749628Z Entering 'third_party/aiter' 2025-12-04T14:00:00.3760018Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T14:00:00.3770491Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T14:00:00.3780546Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T14:00:00.3800595Z Entering 'third_party/benchmark' 2025-12-04T14:00:00.3810704Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T14:00:00.3823795Z Entering 'third_party/composable_kernel' 2025-12-04T14:00:00.3833691Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T14:00:00.3846114Z Entering 'third_party/cpp-httplib' 2025-12-04T14:00:00.3865266Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T14:00:00.3872735Z Entering 'third_party/cpuinfo' 2025-12-04T14:00:00.3882580Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T14:00:00.3893414Z Entering 'third_party/cudnn_frontend' 2025-12-04T14:00:00.3903792Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-12-04T14:00:00.3912860Z Entering 'third_party/cutlass' 2025-12-04T14:00:00.3923558Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-12-04T14:00:00.3936781Z Entering 'third_party/fbgemm' 2025-12-04T14:00:00.3948625Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T14:00:00.3961095Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T14:00:00.3976211Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T14:00:00.3991282Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T14:00:00.4007801Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T14:00:00.4021847Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T14:00:00.4034413Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-12-04T14:00:00.4045318Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T14:00:00.4056867Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T14:00:00.4069149Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T14:00:00.4079371Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T14:00:00.4088326Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T14:00:00.4101780Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T14:00:00.4110725Z Entering 'third_party/fbgemm/external/json' 2025-12-04T14:00:00.4123990Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-12-04T14:00:00.4135317Z Entering 
'third_party/flash-attention' 2025-12-04T14:00:00.4144790Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T14:00:00.4155091Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T14:00:00.4163399Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-12-04T14:00:00.4178651Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T14:00:00.4191599Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-12-04T14:00:00.4206073Z Entering 'third_party/flatbuffers' 2025-12-04T14:00:00.4216129Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T14:00:00.4228058Z Entering 'third_party/fmt' 2025-12-04T14:00:00.4238812Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-12-04T14:00:00.4254338Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T14:00:00.4264555Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T14:00:00.4273961Z Entering 'third_party/gloo' 2025-12-04T14:00:00.4283707Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-12-04T14:00:00.4294429Z Entering 'third_party/googletest' 2025-12-04T14:00:00.4306975Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-12-04T14:00:00.4315885Z Entering 'third_party/ideep' 2025-12-04T14:00:00.4325884Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T14:00:00.4333917Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T14:00:00.4349306Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T14:00:00.4369724Z Entering 'third_party/ittapi' 2025-12-04T14:00:00.4379813Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-12-04T14:00:00.4390030Z Entering 'third_party/kineto' 2025-12-04T14:00:00.4402853Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T14:00:00.4412184Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T14:00:00.4423280Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T14:00:00.4431582Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T14:00:00.4442332Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T14:00:00.4452039Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T14:00:00.4461710Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T14:00:00.4470212Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T14:00:00.4484285Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T14:00:00.4492946Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T14:00:00.4502263Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T14:00:00.4510687Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T14:00:00.4523792Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T14:00:00.4535294Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T14:00:00.4548151Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-12-04T14:00:00.4560593Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T14:00:00.4571644Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T14:00:00.4591393Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T14:00:00.4607576Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T14:00:00.4621623Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T14:00:00.4636317Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T14:00:00.4645659Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T14:00:00.4659757Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T14:00:00.4670077Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T14:00:00.4681635Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T14:00:00.4696342Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T14:00:00.4708037Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T14:00:00.4723454Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T14:00:00.4733664Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-12-04T14:00:00.4742643Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T14:00:00.4755874Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-12-04T14:00:00.4769246Z Entering 'third_party/kleidiai' 2025-12-04T14:00:00.4781027Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-12-04T14:00:00.4792526Z Entering 'third_party/mimalloc' 
2025-12-04T14:00:00.4803933Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-12-04T14:00:00.4812773Z Entering 'third_party/nlohmann' 2025-12-04T14:00:00.4824412Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-12-04T14:00:00.4838695Z Entering 'third_party/onnx' 2025-12-04T14:00:00.4849981Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-12-04T14:00:00.4866487Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T14:00:00.4876852Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-12-04T14:00:00.4888190Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T14:00:00.4898669Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-12-04T14:00:00.4907762Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T14:00:00.4919554Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-12-04T14:00:00.4933780Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T14:00:00.4948308Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-12-04T14:00:00.4956832Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T14:00:00.4967627Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-12-04T14:00:00.4976811Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T14:00:00.4985546Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-12-04T14:00:00.4995949Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T14:00:00.5008051Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-12-04T14:00:00.5017059Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T14:00:00.5031691Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-12-04T14:00:00.5040843Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T14:00:00.5051043Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T14:00:00.5060724Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T14:00:00.5073103Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T14:00:00.5080960Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T14:00:00.5091056Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T14:00:00.5104311Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T14:00:00.5114196Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-12-04T14:00:00.5133092Z Entering 'third_party/pocketfft' 2025-12-04T14:00:00.5143458Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-12-04T14:00:00.5152564Z Entering 'third_party/protobuf' 2025-12-04T14:00:00.5163065Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-12-04T14:00:00.5173966Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T14:00:00.5183677Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-12-04T14:00:00.5193772Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T14:00:00.5203715Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-12-04T14:00:00.5218601Z Entering 'third_party/psimd' 2025-12-04T14:00:00.5232463Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-12-04T14:00:00.5242812Z Entering 'third_party/pthreadpool' 2025-12-04T14:00:00.5252772Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-12-04T14:00:00.5266317Z Entering 'third_party/pybind11' 2025-12-04T14:00:00.5275521Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-12-04T14:00:00.5285424Z Entering 'third_party/python-peachpy' 2025-12-04T14:00:00.5294835Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-12-04T14:00:00.5303514Z Entering 'third_party/sleef' 2025-12-04T14:00:00.5313718Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-12-04T14:00:00.5322712Z Entering 'third_party/tensorpipe' 2025-12-04T14:00:00.5332975Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-12-04T14:00:00.5342624Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T14:00:00.5352153Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-12-04T14:00:00.5362200Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T14:00:00.5376823Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-12-04T14:00:00.5386471Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T14:00:00.5397194Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-12-04T14:00:00.5409960Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T14:00:00.5429831Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-12-04T14:00:00.5439887Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T14:00:00.5449883Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-12-04T14:00:00.5479298Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config --name-only 
--get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5499261Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5515760Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5533393Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5552941Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5568089Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5583439Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5598202Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5612771Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5625381Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5640400Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5655073Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5671648Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5685991Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5701398Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5715547Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5735931Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5751058Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5766218Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5780103Z 
[command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5795675Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5810682Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5823449Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5837038Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5851046Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5873535Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5888374Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5901416Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5916313Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5934975Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5954201Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5971337Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5984947Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.5999250Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6012754Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6026820Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6042323Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config 
--name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6058552Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6077377Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6092598Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6108546Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6123809Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6140576Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6154870Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6171063Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6186764Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6201391Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6215846Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6230528Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6246221Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6261530Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6275636Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6290477Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6308607Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6323574Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6339100Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6354255Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6370520Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6385408Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6401117Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6416175Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6430195Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6449877Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6467481Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6481931Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6494560Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6509200Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6524114Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config --name-only 
--get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6537485Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6551737Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6566569Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6581234Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6596309Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6611721Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6625931Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6640687Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6655082Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6668346Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6682966Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6698253Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6718303Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:00.6808218Z Post job cleanup. 
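Condensed, the post-job cleanup that follows repeats the same credential scrub that actions/checkout has just finished running against the main repository and every submodule: it copies the runner's .gitconfig into a scratch directory and temporarily points HOME at it, marks the workspace as a safe.directory, then strips any core.sshCommand and http.https://github.com/.extraheader entries so the ***-masked auth token does not persist on the runner. A minimal sketch of that sequence, using the same git invocations recorded in this log and assuming the workspace path /home/runner/_work/pytorch/pytorch shown above:

  git config --global --add safe.directory /home/runner/_work/pytorch/pytorch
  git config --local --name-only --get-regexp 'core\.sshCommand'
  git submodule foreach --recursive sh -c \
    "git config --local --name-only --get-regexp 'core\.sshCommand' \
     && git config --local --unset-all 'core.sshCommand' || :"
  git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader'
  git config --local --unset-all 'http.https://github.com/.extraheader'
  git submodule foreach --recursive sh -c \
    "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' \
     && git config --local --unset-all 'http.https://github.com/.extraheader' || :"

The trailing '|| :' in the foreach commands keeps the recursive walk from aborting on submodules where the key was never set, which is why the per-submodule "Entering ..." lines above appear even when no extraheader or sshCommand value is printed beneath them.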
2025-12-04T14:00:00.7265560Z [command]/usr/bin/git version 2025-12-04T14:00:00.7294954Z git version 2.52.0 2025-12-04T14:00:00.7321881Z Copying '/home/runner/.gitconfig' to '/home/runner/_work/_temp/9e0999af-01d0-4c57-a4e0-716bc118887e/.gitconfig' 2025-12-04T14:00:00.7328020Z Temporarily overriding HOME='/home/runner/_work/_temp/9e0999af-01d0-4c57-a4e0-716bc118887e' before making global git config changes 2025-12-04T14:00:00.7328359Z Adding repository directory to the temporary git global config as a safe directory 2025-12-04T14:00:00.7330832Z [command]/usr/bin/git config --global --add safe.directory /home/runner/_work/pytorch/pytorch 2025-12-04T14:00:00.7360757Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-12-04T14:00:00.7381163Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-12-04T14:00:00.7580681Z Entering 'android/libs/fbjni' 2025-12-04T14:00:00.7612307Z Entering 'third_party/FP16' 2025-12-04T14:00:00.7641095Z Entering 'third_party/FXdiv' 2025-12-04T14:00:00.7663502Z Entering 'third_party/NNPACK' 2025-12-04T14:00:00.7694871Z Entering 'third_party/NVTX' 2025-12-04T14:00:00.7716991Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T14:00:00.7737471Z Entering 'third_party/XNNPACK' 2025-12-04T14:00:00.7762799Z Entering 'third_party/aiter' 2025-12-04T14:00:00.7787588Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T14:00:00.7825030Z Entering 'third_party/benchmark' 2025-12-04T14:00:00.7855277Z Entering 'third_party/composable_kernel' 2025-12-04T14:00:00.7885928Z Entering 'third_party/cpp-httplib' 2025-12-04T14:00:00.7908317Z Entering 'third_party/cpuinfo' 2025-12-04T14:00:00.7936584Z Entering 'third_party/cudnn_frontend' 2025-12-04T14:00:00.7963975Z Entering 'third_party/cutlass' 2025-12-04T14:00:00.7991750Z Entering 'third_party/fbgemm' 2025-12-04T14:00:00.8016328Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T14:00:00.8040607Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T14:00:00.8073420Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T14:00:00.8094515Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T14:00:00.8130070Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T14:00:00.8159919Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T14:00:00.8183322Z Entering 'third_party/fbgemm/external/json' 2025-12-04T14:00:00.8221098Z Entering 'third_party/flash-attention' 2025-12-04T14:00:00.8248947Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T14:00:00.8281874Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T14:00:00.8312498Z Entering 'third_party/flatbuffers' 2025-12-04T14:00:00.8348947Z Entering 'third_party/fmt' 2025-12-04T14:00:00.8375837Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T14:00:00.8398384Z Entering 'third_party/gloo' 2025-12-04T14:00:00.8425099Z Entering 'third_party/googletest' 2025-12-04T14:00:00.8455931Z Entering 'third_party/ideep' 2025-12-04T14:00:00.8480057Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T14:00:00.8510895Z Entering 'third_party/ittapi' 2025-12-04T14:00:00.8540447Z Entering 'third_party/kineto' 2025-12-04T14:00:00.8563057Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T14:00:00.8592484Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T14:00:00.8624576Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T14:00:00.8651200Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T14:00:00.8677753Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T14:00:00.8703010Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T14:00:00.8725389Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T14:00:00.8760163Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T14:00:00.8784474Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T14:00:00.8804616Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T14:00:00.8827197Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T14:00:00.8852375Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T14:00:00.8880881Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T14:00:00.8914770Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T14:00:00.8949291Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T14:00:00.8980506Z Entering 'third_party/kleidiai' 2025-12-04T14:00:00.9009511Z Entering 'third_party/mimalloc' 2025-12-04T14:00:00.9036091Z Entering 'third_party/nlohmann' 2025-12-04T14:00:00.9064010Z Entering 'third_party/onnx' 2025-12-04T14:00:00.9099366Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T14:00:00.9128366Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T14:00:00.9165159Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T14:00:00.9190715Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T14:00:00.9217720Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T14:00:00.9246329Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T14:00:00.9271923Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T14:00:00.9299232Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T14:00:00.9327626Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T14:00:00.9358186Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T14:00:00.9387005Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T14:00:00.9422514Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T14:00:00.9460368Z Entering 'third_party/pocketfft' 2025-12-04T14:00:00.9488397Z Entering 'third_party/protobuf' 2025-12-04T14:00:00.9515385Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T14:00:00.9540803Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T14:00:00.9568977Z Entering 'third_party/psimd' 2025-12-04T14:00:00.9596211Z Entering 'third_party/pthreadpool' 2025-12-04T14:00:00.9623484Z Entering 'third_party/pybind11' 2025-12-04T14:00:00.9653626Z Entering 'third_party/python-peachpy' 2025-12-04T14:00:00.9676706Z Entering 'third_party/sleef' 2025-12-04T14:00:00.9702096Z Entering 'third_party/tensorpipe' 2025-12-04T14:00:00.9728238Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T14:00:00.9750288Z Entering 
'third_party/tensorpipe/third_party/libnop' 2025-12-04T14:00:00.9774082Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T14:00:00.9797371Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T14:00:00.9821479Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T14:00:00.9867057Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-12-04T14:00:00.9896274Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-12-04T14:00:01.0085973Z Entering 'android/libs/fbjni' 2025-12-04T14:00:01.0117839Z Entering 'third_party/FP16' 2025-12-04T14:00:01.0141427Z Entering 'third_party/FXdiv' 2025-12-04T14:00:01.0165385Z Entering 'third_party/NNPACK' 2025-12-04T14:00:01.0185955Z Entering 'third_party/NVTX' 2025-12-04T14:00:01.0210042Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T14:00:01.0231567Z Entering 'third_party/XNNPACK' 2025-12-04T14:00:01.0260562Z Entering 'third_party/aiter' 2025-12-04T14:00:01.0289253Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T14:00:01.0318547Z Entering 'third_party/benchmark' 2025-12-04T14:00:01.0348652Z Entering 'third_party/composable_kernel' 2025-12-04T14:00:01.0377309Z Entering 'third_party/cpp-httplib' 2025-12-04T14:00:01.0405551Z Entering 'third_party/cpuinfo' 2025-12-04T14:00:01.0429725Z Entering 'third_party/cudnn_frontend' 2025-12-04T14:00:01.0462235Z Entering 'third_party/cutlass' 2025-12-04T14:00:01.0495703Z Entering 'third_party/fbgemm' 2025-12-04T14:00:01.0526622Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T14:00:01.0549423Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T14:00:01.0578510Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T14:00:01.0601361Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T14:00:01.0633616Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T14:00:01.0657464Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T14:00:01.0678834Z Entering 'third_party/fbgemm/external/json' 2025-12-04T14:00:01.0704524Z Entering 'third_party/flash-attention' 2025-12-04T14:00:01.0728555Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T14:00:01.0752568Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T14:00:01.0786539Z Entering 'third_party/flatbuffers' 2025-12-04T14:00:01.0810895Z Entering 'third_party/fmt' 2025-12-04T14:00:01.0831628Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T14:00:01.0853554Z Entering 'third_party/gloo' 2025-12-04T14:00:01.0876784Z Entering 'third_party/googletest' 2025-12-04T14:00:01.0903466Z Entering 'third_party/ideep' 2025-12-04T14:00:01.0928685Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T14:00:01.0957182Z Entering 'third_party/ittapi' 2025-12-04T14:00:01.0983789Z Entering 'third_party/kineto' 2025-12-04T14:00:01.1017803Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T14:00:01.1040285Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T14:00:01.1064164Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T14:00:01.1086450Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T14:00:01.1116208Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T14:00:01.1144202Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T14:00:01.1174655Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T14:00:01.1195406Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T14:00:01.1222665Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T14:00:01.1245630Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T14:00:01.1271791Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T14:00:01.1295948Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T14:00:01.1323085Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T14:00:01.1356164Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T14:00:01.1379836Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T14:00:01.1405863Z Entering 'third_party/kleidiai' 2025-12-04T14:00:01.1432733Z Entering 'third_party/mimalloc' 2025-12-04T14:00:01.1458486Z Entering 'third_party/nlohmann' 2025-12-04T14:00:01.1484294Z Entering 'third_party/onnx' 2025-12-04T14:00:01.1513811Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T14:00:01.1539430Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T14:00:01.1562738Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T14:00:01.1589666Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T14:00:01.1613033Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T14:00:01.1635657Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T14:00:01.1655964Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T14:00:01.1675438Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T14:00:01.1698970Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T14:00:01.1724167Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T14:00:01.1749204Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T14:00:01.1779191Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T14:00:01.1809153Z Entering 'third_party/pocketfft' 2025-12-04T14:00:01.1836045Z Entering 'third_party/protobuf' 2025-12-04T14:00:01.1860975Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T14:00:01.1884603Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T14:00:01.1916142Z Entering 'third_party/psimd' 2025-12-04T14:00:01.1941064Z Entering 'third_party/pthreadpool' 2025-12-04T14:00:01.1963268Z Entering 'third_party/pybind11' 2025-12-04T14:00:01.1993411Z Entering 'third_party/python-peachpy' 2025-12-04T14:00:01.2014011Z Entering 'third_party/sleef' 2025-12-04T14:00:01.2035389Z Entering 'third_party/tensorpipe' 2025-12-04T14:00:01.2059891Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T14:00:01.2095049Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T14:00:01.2124191Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T14:00:01.2146211Z Entering 'third_party/tensorpipe/third_party/pybind11' 
2025-12-04T14:00:01.2171862Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T14:00:01.2218029Z [command]/usr/bin/git config --local --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.2241262Z [command]/usr/bin/git submodule foreach --recursive git config --local --show-origin --name-only --get-regexp remote.origin.url 2025-12-04T14:00:01.2411983Z Entering 'android/libs/fbjni' 2025-12-04T14:00:01.2423391Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T14:00:01.2432711Z Entering 'third_party/FP16' 2025-12-04T14:00:01.2443459Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T14:00:01.2455928Z Entering 'third_party/FXdiv' 2025-12-04T14:00:01.2467693Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T14:00:01.2475881Z Entering 'third_party/NNPACK' 2025-12-04T14:00:01.2484957Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T14:00:01.2495489Z Entering 'third_party/NVTX' 2025-12-04T14:00:01.2505313Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T14:00:01.2514269Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T14:00:01.2523487Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T14:00:01.2531974Z Entering 'third_party/XNNPACK' 2025-12-04T14:00:01.2543262Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T14:00:01.2563622Z Entering 'third_party/aiter' 2025-12-04T14:00:01.2574471Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T14:00:01.2586128Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T14:00:01.2597440Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T14:00:01.2610733Z Entering 'third_party/benchmark' 2025-12-04T14:00:01.2620756Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T14:00:01.2630191Z Entering 'third_party/composable_kernel' 2025-12-04T14:00:01.2646397Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T14:00:01.2660285Z Entering 'third_party/cpp-httplib' 2025-12-04T14:00:01.2674239Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T14:00:01.2683510Z Entering 'third_party/cpuinfo' 2025-12-04T14:00:01.2693837Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T14:00:01.2702720Z Entering 'third_party/cudnn_frontend' 2025-12-04T14:00:01.2713937Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-12-04T14:00:01.2724457Z Entering 'third_party/cutlass' 2025-12-04T14:00:01.2734283Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-12-04T14:00:01.2747219Z Entering 'third_party/fbgemm' 2025-12-04T14:00:01.2756289Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T14:00:01.2765918Z Entering 
'third_party/fbgemm/external/asmjit' 2025-12-04T14:00:01.2777979Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T14:00:01.2786944Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T14:00:01.2797277Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T14:00:01.2810420Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T14:00:01.2820057Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-12-04T14:00:01.2828725Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T14:00:01.2838288Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T14:00:01.2850693Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T14:00:01.2860624Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T14:00:01.2869189Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T14:00:01.2878176Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T14:00:01.2888394Z Entering 'third_party/fbgemm/external/json' 2025-12-04T14:00:01.2896686Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-12-04T14:00:01.2908014Z Entering 'third_party/flash-attention' 2025-12-04T14:00:01.2918295Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T14:00:01.2927301Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T14:00:01.2936447Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-12-04T14:00:01.2947219Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T14:00:01.2956565Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-12-04T14:00:01.2975962Z Entering 'third_party/flatbuffers' 2025-12-04T14:00:01.2990053Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T14:00:01.3000444Z Entering 'third_party/fmt' 2025-12-04T14:00:01.3011285Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-12-04T14:00:01.3021975Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T14:00:01.3033124Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T14:00:01.3047177Z Entering 'third_party/gloo' 2025-12-04T14:00:01.3057004Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-12-04T14:00:01.3065464Z Entering 'third_party/googletest' 2025-12-04T14:00:01.3074549Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-12-04T14:00:01.3083092Z Entering 'third_party/ideep' 2025-12-04T14:00:01.3094587Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T14:00:01.3102574Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T14:00:01.3121814Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T14:00:01.3143208Z Entering 'third_party/ittapi' 2025-12-04T14:00:01.3154114Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-12-04T14:00:01.3163127Z Entering 'third_party/kineto' 2025-12-04T14:00:01.3174486Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T14:00:01.3184875Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T14:00:01.3202399Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T14:00:01.3211775Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T14:00:01.3223758Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T14:00:01.3236890Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T14:00:01.3247690Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T14:00:01.3255836Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T14:00:01.3264648Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T14:00:01.3272833Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T14:00:01.3283751Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T14:00:01.3292818Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T14:00:01.3303385Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T14:00:01.3314099Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T14:00:01.3325519Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-12-04T14:00:01.3333847Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T14:00:01.3343831Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T14:00:01.3356148Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T14:00:01.3369303Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T14:00:01.3381352Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T14:00:01.3390234Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T14:00:01.3398994Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 
2025-12-04T14:00:01.3409611Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T14:00:01.3417650Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T14:00:01.3427428Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T14:00:01.3442287Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T14:00:01.3452227Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T14:00:01.3464990Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T14:00:01.3474738Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-12-04T14:00:01.3482502Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T14:00:01.3492838Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-12-04T14:00:01.3503282Z Entering 'third_party/kleidiai' 2025-12-04T14:00:01.3514007Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-12-04T14:00:01.3523459Z Entering 'third_party/mimalloc' 2025-12-04T14:00:01.3533336Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-12-04T14:00:01.3542244Z Entering 'third_party/nlohmann' 2025-12-04T14:00:01.3552139Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-12-04T14:00:01.3561955Z Entering 'third_party/onnx' 2025-12-04T14:00:01.3571779Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-12-04T14:00:01.3587375Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T14:00:01.3597372Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-12-04T14:00:01.3609642Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T14:00:01.3620500Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-12-04T14:00:01.3632175Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T14:00:01.3644368Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-12-04T14:00:01.3653579Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T14:00:01.3664752Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-12-04T14:00:01.3673292Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T14:00:01.3684955Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-12-04T14:00:01.3694302Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T14:00:01.3703881Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-12-04T14:00:01.3712322Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T14:00:01.3722378Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-12-04T14:00:01.3731197Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T14:00:01.3740754Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-12-04T14:00:01.3749465Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T14:00:01.3761194Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T14:00:01.3769882Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T14:00:01.3785732Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T14:00:01.3795651Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T14:00:01.3811076Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T14:00:01.3822271Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T14:00:01.3833679Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-12-04T14:00:01.3852323Z Entering 'third_party/pocketfft' 2025-12-04T14:00:01.3862582Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-12-04T14:00:01.3871258Z Entering 'third_party/protobuf' 2025-12-04T14:00:01.3882028Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-12-04T14:00:01.3896527Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T14:00:01.3912371Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-12-04T14:00:01.3926442Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T14:00:01.3937375Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-12-04T14:00:01.3951868Z Entering 'third_party/psimd' 2025-12-04T14:00:01.3963196Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-12-04T14:00:01.3974526Z Entering 'third_party/pthreadpool' 2025-12-04T14:00:01.3985530Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-12-04T14:00:01.3999878Z Entering 'third_party/pybind11' 2025-12-04T14:00:01.4015859Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-12-04T14:00:01.4025089Z Entering 'third_party/python-peachpy' 2025-12-04T14:00:01.4036941Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-12-04T14:00:01.4046981Z Entering 'third_party/sleef' 2025-12-04T14:00:01.4058428Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-12-04T14:00:01.4066716Z Entering 'third_party/tensorpipe' 2025-12-04T14:00:01.4076465Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-12-04T14:00:01.4084645Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T14:00:01.4102676Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-12-04T14:00:01.4112249Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T14:00:01.4122634Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-12-04T14:00:01.4136942Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T14:00:01.4146789Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-12-04T14:00:01.4159700Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T14:00:01.4177009Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-12-04T14:00:01.4184424Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T14:00:01.4199346Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-12-04T14:00:01.4229876Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4252363Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4267646Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4285693Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4304875Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4324356Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4338738Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4357772Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4374284Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4388694Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4403065Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4416487Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4436271Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4450225Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4464597Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4478735Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4492757Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4505605Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4519772Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4536433Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4554033Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4572966Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4593140Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4607738Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4627991Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4643053Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4662746Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4676756Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 
2025-12-04T14:00:01.4692279Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4705097Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4718497Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4731192Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4745052Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4758100Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4778737Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4792371Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4806855Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4821933Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4837111Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4852556Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4868950Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4891591Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4906473Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4923575Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config --name-only --get-regexp ^includeIf\.gitdir: 
2025-12-04T14:00:01.4938780Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4952440Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4965740Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4978756Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.4992217Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5012445Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5026442Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5039532Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5054737Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5071239Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5085587Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5105128Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5119857Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5135083Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5153858Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5179253Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config --name-only --get-regexp ^includeIf\.gitdir: 
2025-12-04T14:00:01.5195657Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5210391Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5228298Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5243092Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5256989Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5280561Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5301396Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5315984Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5331151Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5350423Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5368063Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5385429Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5406535Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5429385Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5445910Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5461890Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5476675Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config --name-only 
--get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5497839Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5512607Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5527484Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5542565Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T14:00:01.5650148Z Cleaning up orphan processes